Abstract



The intrinsic treatment of binding in the lambda calculus makes it an ideal data struc- 
ture for representing syntactic objects with binding such as formulas, proofs, types, and 
programs. Supporting such a data structure in an implementation is made difficult by 
the complexity of the substitution operation relative to lambda terms. To remedy this, 
researchers have suggested representing the meta level substitution operation explicitly in 
a refined treatment of the lambda calculus. The benefit of an explicit representation is that 
it allows for a fine-grained control over the substitution process, leading also to the ability 
to intermingle substitution with other operations on lambda terms. This insight has led
to the development of various explicit substitution calculi and to their exploitation in new 
algorithms for operations such as higher-order unification. Considerable care is needed, 
however, in designing explicit substitution calculi since within them the usually implicit op- 
erations related to substitution can interact in unexpected ways with notions of reduction 
from standard treatments of the lambda calculus. 

This thesis describes a particular realization of explicit substitutions known as the sus- 
pension calculus and shows that it has many properties that are useful in a computational 
setting. One significant property is the ability to combine substitutions. An earlier version 
of the suspension calculus has such an ability, but the complexity of the machinery realiz- 
ing it in a complete form has deterred its direct use in implementations. To overcome this 
drawback a derived version of the calculus had been developed and used in practice. Un- 
fortunately, the derived calculus sacrifices generality and loses a property that is important 
for new approaches to unification. This thesis redresses this situation by presenting a mod- 
ified form of the substitution combination mechanism that retains the generality and the 
computational properties of the original calculus while being simple enough to use directly 
in implementations. These modifications also rationalize the structure of the calculus, mak- 
ing it possible to easily superimpose additional logical structure over it. We illustrate this 
capability by showing how typing in the lambda calculus can be treated in the resulting 
framework and by presenting a natural translation into the λσ-calculus, another well-known
treatment of explicit substitutions. 

Another contribution of this thesis is a survey of the realm of explicit substitution calculi. 
In particular, we describe the computational properties that are desired in this setting 
and then characterize various calculi based on how well they capture these. We utilize 
the simplified suspension calculus in this process. In particular, we describe translations 
between the other popular calculi and the suspension calculus towards understanding and 
contrasting their relative capabilities. Finally, we discuss an elusive property of explicit 
substitution calculi known as preservation of strong normalization and discuss why there is 
hope that the suspension calculus possesses this property. 






Contents 



1 Introduction
1.1 The Lambda Calculus and Its Treatment of Binding
1.2 The Explicit Treatment of Substitution
1.3 Contributions of the Thesis
1.4 Outline of the Thesis

2 The Lambda Calculus
2.1 The Syntax and Meaning of Lambda Terms
2.1.1 Syntax
2.1.2 Equality and Equivalence
2.2 De Bruijn Notation
2.2.1 Terms in the De Bruijn Notation
2.2.2 Conversion and Equality in the De Bruijn Notation
2.3 Properties of the Lambda Calculus
2.3.1 β-equivalence and Confluence
2.3.2 Normal Forms and Typed Lambda Calculi
2.4 Existential Variables and Substitution

3 The Suspension Calculus
3.1 Motivation for the Encoding of Substitutions
3.2 Syntax of the Suspension Calculus
3.3 Rules of the Suspension Calculus
3.4 Relationship to the Original Suspension Calculus
3.5 Typed Version of the Suspension Calculus
3.6 Meta Variables and the Suspension Calculus
3.7 Termination of Reading and Merging Rules
3.8 Confluence of Reading and Merging Rules
3.8.1 An Associativity Property for Environment Merging
3.8.2 Proof of Confluence for Reading and Merging Rules
3.9 Simulation of Beta Reduction
3.10 Confluence of Overall Calculus
3.11 Similarity in the Suspension Calculus

4 Comparison with Other Explicit Substitution Calculi
4.1 Calculi Without Merging
4.1.1 The λυ-calculus
4.1.2 The λs-calculus
4.1.3 The λs_e-calculus
4.2 A Calculus with Merging: the λσ-calculus
4.2.1 Suspension Expressions to λσ-expressions
4.2.2 λσ-expressions to Suspension Expressions

5 Preservation of Strong Normalization
5.1 Preservation of Normalizability
5.2 Preservation of Strong Normalization
5.3 PSN in Calculi without Substitution Interaction
5.4 Problems for Calculi with Substitution Interaction
5.5 Status of PSN for the Suspension Calculus

6 Conclusion

Bibliography



Chapter 1 



Introduction 

Binding and scoping of variable names is a fundamental concept in mathematics and com- 
puter science. Consider the mathematical statement that the natural numbers have no 
largest element, 

∀x ∈ ℕ. ∃y ∈ ℕ. y > x

In this statement the occurrences of x and y on the right are bound by the quantifiers on 
the left, and the scoping of these variables is important. Picking y as the value for x and 
substituting blindly would yield, 

∃y ∈ ℕ. y > y

But this statement is no longer true because the substitution was not done correctly. The 
problem is that y is not yet in scope when the quantifier for x appears, thus any substitution 
for x cannot contain the variable y. This situation is replayed in the following computer
science context: 

int x = 3;
int y = x + 2;

Here the variables x and y are bound by declaration of int x and int y, and here again 
the scoping is important: we can use x in defining y but not vice-versa. The point of these 
examples is that if we are going to represent and manipulate objects from mathematics and 
computer science, then we should use a language which has an understanding of binding 
and scoping. 

1.1 The Lambda Calculus and Its Treatment of Binding 

The lambda calculus is a notation for functions which correctly captures notions of binding 
and scoping [Chu40]. For example, the function which maps x to x + x is encoded as
(λx. x + x), and applying this function to the argument 3 is encoded as ((λx. x + x) 3). We
expect that this expression is equal to 3 + 3, and indeed the lambda calculus formalizes this
notion of equality so that the equation ((λx. x + x) 3) = 3 + 3 holds. Binding in the lambda
calculus is done such that variable occurrences are bound by the closest enclosing binder
which matches the variable name. For instance, in (λx.λx. x + x) the two occurrences of x
on the right are bound by the second lambda. Substitution in the lambda calculus is capture
avoiding in the sense that a free variable, one which is not bound by a lambda, cannot
become bound by the process of substitution. Thus it is not true that ((λx.λy.x) y) =
(λy.y). Instead, the bound variable y is first renamed before the free variable y is substituted,
which yields the true equality ((λx.λy.x) y) = (λz.y). Finally, note that we could have
picked a name other than z here so long as it did not capture the free variable y, because
renaming of bound variables is an intrinsic property of equality in the lambda calculus.

Given this notion of equality, we can think of assigning directionality to it so that 
((λx. x + x) 3) → 3 + 3. This directionality gives rise to a notion of computation through
function evaluation, which is common to most programming languages. Based on this idea, 
Landin argues that we can understand any particular programming language by translating 
it into the lambda calculus [Lan66]. Such a translation actually turns out to be a useful 
means for providing denotational semantics for programming languages based on the model 
theory for the lambda calculus developed by Plotkin, Scott, and Strachey [Sto81]. Another 
benefit of thinking of the lambda calculus as a common substrate for programming languages 
is that the essence of issues such as typing and evaluation strategies can be studied without 
the distraction of auxiliary features and nuances of any particular programming language. 

A more recent development related to the lambda calculus is its use in representing 
syntactic objects whose structure incorporates some form of binding. The motivation behind 
this is that common notions of binding in these settings can be handled using the lambda 
calculus as a data structure. For example, the content of quantifiers such as "for all" and 
"there exists" in mathematical logic can be separated into two parts: one part which is the 
binding of a variable and another part which is predicative in nature. The predicative part 
can be represented by an appropriate constant such as forall or exists and the binding part 
can be represented by a lambda abstraction. Using these ideas, the formula ∀x ∃y (y > x)
considered earlier in this chapter can be represented by the lambda calculus expression

forall (λx. exists (λy. gt y x))

where gt is a constant representing the "greater than" relation. Using this representation,
logical operations related to binding can be performed using the lambda calculus. For
example, if we want to substitute y for x in the formula

∀x ∈ ℕ. ∃y ∈ ℕ. y > x

we do this by applying the argument of our forall constant to the variable y and then
performing a reduction in the lambda calculus:

(λx. exists (λy. gt y x)) y → exists (λz. gt z y)

This result corresponds to the logical formula

∃z ∈ ℕ. z > y

Similarly, we may wish to identify our original statement with the statement

∀a ∈ ℕ. ∃b ∈ ℕ. b > a

and this is provided for free by the lambda calculus.

Similar notions of binding and substitution occur when representing many different 
kinds of objects such as formulas, proofs, types, and programs. The idea is that we can use 
a single meta language, the lambda calculus, to talk about all of these different objects. 
Then the correctness concerns of binding and substitution only need to be handled once, 
in this meta language, and then the benefit is shared in all other contexts. 

1.2 The Explicit Treatment of Substitution 

At the heart of the lambda calculus is a notion of substitution which respects the binding 
structure of functions in the language. The traditional presentation of the lambda calculus 
takes this substitution as a meta operation, which is impractical from an implementation 
perspective. Because of this, various methods have been developed for dealing with substi- 
tution in actual implementations. In the computational setting, what is often done is that 
an environment is kept which contains substitutions for variables. This approach has been 
successful in practice, but it restricts the possible evaluation strategies, since it expects that 
every variable encountered has a substitution available in the environment. For example, 
this expectation will not hold if we do evaluations underneath lambda abstractions since 
the abstracted variables have no substitutions.

In the representational setting, substitution is also a problem. Here we often perform 
unification modulo the rules of the lambda calculus, and the handling of substitution has a 
significant impact on the efficiency of this operation. Consider, for example, the unification 
(λx. c t1) t2 = d for distinct constants c and d and some terms t1, t2, and t3. A naive
approach requires substituting the term t2 for x throughout the term t1, which can be
arbitrarily large. Instead, a more sophisticated technique is to treat substitution explicitly
and include it as part of the language. Then we can reduce this unification problem to
c t1(t2/x) = d where (t2/x) encodes a substitution that is part of the language and not
a meta operation. At this point we can answer "no" since c and d are distinct constants,
and thus we avoid traversing the term t1.

Explicit treatment of substitution introduces its own difficulties. One particular dif- 
ficulty stems from our interest in identifying expressions that differ only in the names of 
bound variables. To accommodate this, we use a nameless notation for the lambda calculus 
invented by de Bruijn [dB72]. In this notation, we replace each variable occurrence with 
an integer which counts the number of lambda abstractions between the occurrence and its 
binder. For example, the lambda term (λx.λy. x x y) is encoded as (λ λ 2 2 1). This provides
a unique encoding of terms differing only in bound variable renaming, but now substitution 
must take into account renumberings when substituting underneath a lambda abstraction. 
For example, (λ λ ((λ λ 2) 2)) = (λ λ λ 3). Thus any system for explicit substitutions must
keep track of the renumbering to be done. 

The explicit treatment of substitutions also gives rise to new benefits in the lambda 
calculus. One is that we may think of merging substitutions to decrease the amount of work 
done in traversing a term. For example, reducing ((λx.λy.t1) t2 t3) traditionally requires
two traversals over the term t1, but if the substitutions generated by reducing the above term
are explicit then we can think of merging them into a single substitution before their effects
are propagated onto t1. Another benefit is that existing procedures such as unification can
be improved upon by mixing their operations with the operations of explicit substitutions 
[DHK95]. 

These motivations for the explicit treatment of substitutions have been recognized by
researchers, resulting in a wide variety of explicit substitution calculi [Fie90, NW90, ACCL91,
KR95, BBLRD96, KR97, DG01b]. This thesis focuses on the suspension calculus [NW90].

1.3 Contributions of the Thesis

In this thesis we use the suspension calculus as a viewpoint into the realm of explicit 
substitution calculi. The particular contributions of this thesis are 

A simplification and rationalization of the suspension calculus 

The suspension calculus includes not only an explicit representation of substitutions, but 
also a mechanism for combining such substitutions. This combining is realized through a 
seemingly complex set of rules, and this apparent complexity has led to the development
of derived calculi [Nad99]. These derived calculi work well for reduction, but they lack an 
essential property and therefore they cannot be used to perform unification in a setting 
with a special interpretation of instantiable variables. This thesis offers another possibil- 
ity by simplifying these combination rules so that the full calculus is practical to use in 
implementations, while retaining all of the essential properties of the original suspension 
calculus. These changes have the added benefit of rationalizing the calculus so that a typed 
version of the calculus is possible, and translations to other calculi are now feasible. 

A comparison of explicit substitution calculi 

A wide variety of explicit substitution calculi have been proposed, but no systematic at- 
tempt has been made to compare these calculi and analyze their essential features. In this 
thesis we outline desirable properties for explicit substitution calculi and use these to orga- 
nize a survey of the more popular calculi. We also explore the relationship between these 
calculi and the suspension calculus by defining explicit translations for expressions. These 
translations serve as a means of understanding the notation and rules of each calculus using 
the framework of the suspension calculus. 

1.4 Outline of the Thesis 

The rest of this thesis is organized as follows. In the next chapter we review the lambda 
calculus and introduce the terminology associated with it that we will use in later chapters. 
We then describe the suspension calculus in Chapter 3 and prove its key properties. In 
Chapter 4 we survey other explicit substitution calculi and compare them with the sus- 
pension calculus. Then in Chapter 5 we discuss the property of preservation of strong 
normalization, which is an important issue for explicit substitution calculi. Finally, in 
Chapter 6 we review the contributions of this thesis and discuss possible avenues for future 
research. 



Chapter 2 



The Lambda Calculus 

The lambda calculus is a language of functions — a simple and concise syntax for describing 
a powerful and expressive language. As discussed in the introduction, this calculus has two 
related uses at a practical level: (1) to represent syntactic objects that have a functional 
structure in a way that captures their functionality and (2) by interpreting representations 
of functions essentially as rules for calculations, it also supports the ability to encode com- 
putations in a fundamental way. In short, we call these two uses "representation" and 
"computation," respectively. The goal of this chapter is to define the calculus and present 
notation and properties that support these two different uses. 

In the first section, we define the lambda calculus and various notions of equality on 
lambda terms. Following that, we look at the de Bruijn notation for the lambda calculus 
which precisely captures the most fundamental of these equality notions. In the third sec- 
tion, we look at properties of the lambda calculus which are important to its consistent and 
meaningful use in the roles of representation and computation. Finally, we look at adding 
instantiable variables to the calculus to support higher-order unification in the representa- 
tional setting. The results presented in this chapter are well established in the literature 
and proven, for instance, in [Bar81].

2.1 The Syntax and Meaning of Lambda Terms

Is x + x a function? The answer depends on the context. In an equation such as x + x =
4, we think of x + x as a value, not a function. On the other hand, in the statement
"x + x is primitive recursive," we are thinking of the function which maps x to x + x.
Church resolved this ambiguity by introducing an explicit notation for functions, the lambda
calculus [Chu40]. In the lambda calculus we denote a function mapping x to x + x by
(λx. x + x). The lambda in this expression creates an abstraction over x so that x is a
bound variable within the subexpression x + x. Juxtaposition denotes application, and so
((λx. x + x) 5) is the application of our function to the argument 5. Later in this section
we will introduce notions of equality that capture our intuition about function equality and
evaluation. Surprisingly, abstraction and application along with these notions of equality
can model all computable functions.

With the syntax above, we can now use functions in a first-class way. That is, we can 
use functions not only to manipulate input and produce output, but also as the input and 
output of other functions. Consider the function (λx.(λy. x + y)). This is a function which,
when applied to some input, produces another function as output. This allows us to encode
functions of more than one argument by nesting functions of exactly one argument. As
another example, consider (λf.(λx. f x)). This function takes a function f and an argument
x and returns the application of f to x. In this instance, we are thinking of a function as
input to another function. In this way, functions in the lambda calculus are first-class and 
higher-order — an expressive notion captured by simple syntax. 
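The two example terms above can be mirrored directly in a programming language with first-class functions. The following is only an illustrative sketch in Python; the names add, apply, add_three, and result are chosen here and do not come from the text.

```python
# (λx.(λy. x + y)): a function that returns another function.
add = lambda x: (lambda y: x + y)

# (λf.(λx. f x)): a function that takes a function as its input.
apply = lambda f: (lambda x: f(x))

add_three = add(3)            # a function still waiting for its second argument
result = apply(add_three)(4)  # applies add_three to 4, i.e. computes 3 + 4
```

Nesting one-argument functions in this way is how a two-argument function such as addition is encoded.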

2.1.1 Syntax 

Formally, the terms in the lambda calculus are defined by the following.

Definition 2.1.1 (Lambda terms). Terms in the lambda calculus are defined by

t ::= c | x | (t t) | (λx.t)

where c ranges over some enumerable set of constants and x over some enumerable set of
variables.

We call (t t) an application and (λx.t) an abstraction. To differentiate between variables
and constants, we denote constants by letters like a, b, c or by appropriate symbols, and we
denote variables by letters like x, y, z. To reduce the number of parentheses we need to
write, we follow the convention that application is left associative and the scope of a lambda
extends as far to the right as possible. For example, (λx.((x y) z)) is written as (λx. x y z).
We will sometimes include optional parentheses when they aid readability.

The syntax in Definition 2.1.1 is a solid starting point for the lambda calculus, but 
there are many notions we need to build on top of this. First, the definition of syntax yields 
an obvious definition of subterm which we will assume. Second, the syntax of the lambda 
calculus contains a notion of binding which we must define explicitly. This is crucial because 
the binding structure of a term is its most essential part. All variable occurrences within a 
term are either free or bound. The basic rule of binding is that a variable occurrence of x
is free if and only if it does not occur within the scope of a λx, and all free occurrences of
the variable x in the term t are bound by the lambda in λx.t.

When picking names for bound variables, we must be careful that they do not conflict
with the free variables in a term. Thus the following definition will be very useful.

Definition 2.1.2 (Free variables). The set of free variables of a lambda term t, denoted
fv(t), is defined recursively as follows:

\[
\begin{aligned}
\mathrm{fv}(x) &= \{x\} \\
\mathrm{fv}(\lambda x.\,t) &= \mathrm{fv}(t) \setminus \{x\} \\
\mathrm{fv}(t_1\ t_2) &= \mathrm{fv}(t_1) \cup \mathrm{fv}(t_2)
\end{aligned}
\]

If fv(t) = ∅ then t is called a closed term.

Now that we have a clear notion of the binding structure of terms, we can relate terms
based on this structure. In the next section, we look at notions of equality based on the
binding structure of terms.

2.1.2 Equality and Equivalence

The introduction of the lambda calculus as a language of functions gives us some
preconceived notions of how lambda terms are related. One is that we expect the specific
variable names used for bound variables to be irrelevant. Concretely, we consider the terms
(λx. x + x) and (λy. y + y) equal, because they represent the same function.

We also expect a notion of equality under function evaluation. So the terms ((λx. x + x) 5)
and 5 + 5 should be equal in this sense. Informally, we want to say that the term (λx.t1) t2
is equal to the term t1[t2/x] where [t2/x] is an operator which replaces all free occurrences
of x in t1 with t2. This substitution operator must respect the binding structure of both t1
and t2. Specifically, we respect the structure of t1 by not substituting t2 in for any bound
occurrences of x, and we respect the structure of t2 by not allowing any free variables in t2
to become bound in t1. These restrictions are what give rise to the various branching
conditions in the following definition.



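Definition 2.1.3 (Substitution). The substitution operation [s/x] which replaces the
variable x with the term s is defined recursively as

\[
\begin{aligned}
c[s/x] &= c \\
y[s/x] &= \begin{cases} s & \text{if } x = y \\ y & \text{otherwise} \end{cases} \\
(t_1\ t_2)[s/x] &= (t_1[s/x])\ (t_2[s/x]) \\
(\lambda y.\,t)[s/x] &= \begin{cases}
  \lambda y.\,t & \text{if } x = y \\
  \lambda y'.\,(t[y'/y][s/x]) & \text{if } y \in \mathrm{fv}(s), \text{ where } y' \notin \mathrm{fv}(s) \cup \mathrm{fv}(t) \text{ and } y' \neq x \\
  \lambda y.\,(t[s/x]) & \text{otherwise}
\end{cases}
\end{aligned}
\]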



The middle case for (λy.t)[s/x] takes advantage of our notion of equality for terms that
differ only in the names of bound variables. In this case, we do not want a free occurrence 
of y in s to be captured by the binding lambda, so we rename the bound variable before 
performing the substitution. 
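The free-variable and substitution definitions can be made concrete with a small program. The sketch below is one possible reading of Definitions 2.1.2 and 2.1.3 in Python, under several assumptions that are not part of the text: terms are encoded as tagged tuples, constants are taken to have no free variables, and fresh names are generated by a hypothetical helper fresh that yields v0, v1, and so on.

```python
from itertools import count

# Illustrative term encoding (an assumption of this sketch):
#   ("const", name) | ("var", name) | ("app", t1, t2) | ("lam", name, body)

def fv(t):
    """Free variables of t; constants are assumed to contribute none."""
    tag = t[0]
    if tag == "const":
        return set()
    if tag == "var":
        return {t[1]}
    if tag == "app":
        return fv(t[1]) | fv(t[2])
    return fv(t[2]) - {t[1]}  # abstraction: remove the bound name

def fresh(avoid):
    """Hypothetical helper: first name of the form v0, v1, ... not in `avoid`."""
    return next(f"v{i}" for i in count() if f"v{i}" not in avoid)

def subst(t, s, x):
    """Compute t[s/x], renaming a bound variable when it would capture."""
    tag = t[0]
    if tag == "const":
        return t
    if tag == "var":
        return s if t[1] == x else t
    if tag == "app":
        return ("app", subst(t[1], s, x), subst(t[2], s, x))
    y, body = t[1], t[2]
    if y == x:
        return t  # x is rebound here, so nothing below is a free x
    if y in fv(s):
        # Middle case: rename the binder to a fresh y' before substituting.
        y2 = fresh(fv(s) | fv(body) | {x})
        return ("lam", y2, subst(subst(body, ("var", y2), y), s, x))
    return ("lam", y, subst(body, s, x))

# (λy.x)[y/x] renames the binder instead of capturing y.
example = subst(("lam", "y", ("var", "x")), ("var", "y"), "x")
```

Here example is ("lam", "v0", ("var", "y")), which corresponds to (λz.y) from the introduction up to the choice of the fresh name.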

With a definition of substitution in place, we can go back and formally define what we 
mean by terms that differ only in the names of bound variables. 

Definition 2.1.4 (α-equivalence). A term s results from a term r by α-conversion if s can
be obtained from r by replacing some subterm of the form λx.t by one of the form λy.(t[y/x])
where y is a variable that is not free in t. Two terms s and r are said to be α-equivalent,
written as s =α r, if one can be obtained from the other by a (possibly empty) sequence of
α-conversions.

We can also formalize a notion of equivalence under evaluation that we suggested before. 

Definition 2.1.5 (β-equivalence). A term s results from a term r by a β-contraction,
denoted r ▷β s, if s can be obtained by replacing a subterm of r of the form (λx.t1) t2,
referred to as a β-redex, by t1[t2/x]. We say also that s results from r by a β-reduction,
denoted r ▷β* s, if it can be obtained from r by a (possibly empty) sequence of α-conversions
and β-contractions. The term r is said to result from s by a β-expansion if s results from
r by a β-contraction. Finally, r and s are said to be β-equivalent if one results from the
other by a sequence of α-conversions, β-contractions, and β-expansions, and we denote this
by r =β s.

Note that β-contraction invokes substitution which may cause the renaming of some
bound variables. For this reason we include α-conversion in our definition of β-reduction
and β-equivalence.

A final notion of equality that we might expect from functions is that of extensional
equivalence. That is, given r x =β s x we might expect a notion of equality that says r and
s are equal. First note that β-equivalence is not powerful enough for this. For instance, the
terms (λx. f x) and f are related in this way, but they are not β-equivalent. It turns out
that we can get this extensionality property through the following equivalence notion.

Definition 2.1.6 (η-equivalence). A term s results from a term r by an η-contraction if s
can be obtained by replacing a subterm of r of the form (λx. f x), referred to as an η-redex, by
f, where x is not free in f. The term r is said to result from s by an η-expansion if s results
from r by an η-contraction. Finally, r and s are said to be η-equivalent if one results from
the other by a sequence of α-conversions, β-conversions, η-contractions, and η-expansions,
and we denote this by r =η s.

Throughout this section we have referred to our equality notions as equivalence relations, 
and the following theorem justifies this. 

Theorem 2.1.1. =α, =β, and =η are equivalence relations.

2.2 De Bruijn Notation

Substitution in the lambda calculus is complicated by the possibility of variable names con- 
flicting. De Bruijn notation is a nameless notation for the lambda calculus which abstracts 
away many of these issues [dB72]. The purpose of variable names in the lambda calculus 
is to associate variable occurrences with their binder. The de Bruijn notation makes this 
association by counting the number of lambdas that occur between a variable occurrence 
and its binder in the abstract syntax tree. In this way, names are removed both from
variable occurrences and from binders. For example, the term (λx.(λy.y) (λz.x)) is encoded as
(λ (λ #1) (λ #2)). This term has the same content as the original, but we have abstracted
away the specific variable names. 

2.2.1 Terms in the De Bruijn Notation 

Formally, the terms in the de Bruijn notation are defined by the following.

Definition 2.2.1 (De Bruijn terms). Terms in the de Bruijn notation are defined by

t ::= c | #i | (t t) | (λ t)

where c ranges over an enumerable set of constants and i, called an index or variable
reference, ranges over the natural numbers.

As with the definition of lambda terms, we call (t t) an application and (λ t) an abstraction.
We also drop parentheses by assuming application is left associative and the scope of
a lambda extends as far right as possible. In addition, we assume the obvious definition of
subterm.

One issue which we have not yet clarified is how to deal with free variables. Because 
free variables have no binders associated with them, it is not obvious how to assign them an 
index. To handle this we will assume a fixed listing of the free variables of a term, which we 
think of as a list of top level binders for the free variables. Thus the term (λx. y x) where y
is free will be encoded as (λ #2 #1). In this term, the #2 refers to the first free variable.

2.2.2 Conversion and Equality in the De Bruijn Notation 

A pleasant property of the de Bruijn notation is that we get a-equivalence for free, i.e., any 
two terms which are a-equivalent in the lambda calculus have the same representation in 
the de Bruijn notation. Thus the complicated a-equivalence check in the lambda calculus 
has been replaced by a simple syntactic equality check in the de Bruijn notation. 
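The translation from named terms to de Bruijn terms, and the resulting syntactic check for α-equivalence, can be sketched as follows. This is an illustrative Python sketch only: the tuple encoding of terms, the function name to_de_bruijn, and the parameters bound and free are assumptions made here, with free playing the role of the fixed listing of free variables described above.

```python
# Named terms:     ("const", c) | ("var", x) | ("app", t1, t2) | ("lam", x, body)
# De Bruijn terms: ("const", c) | ("idx", i) | ("app", t1, t2) | ("lam", body)

def to_de_bruijn(t, bound=(), free=()):
    """Translate a named term to de Bruijn form.

    `bound` holds the enclosing binders, innermost first; `free` is the
    fixed listing of free variables, treated as top-level binders.
    """
    tag = t[0]
    if tag == "const":
        return t
    if tag == "var":
        env = list(bound) + list(free)
        # Indices start at 1; an unlisted free variable raises ValueError.
        return ("idx", env.index(t[1]) + 1)
    if tag == "app":
        return ("app",
                to_de_bruijn(t[1], bound, free),
                to_de_bruijn(t[2], bound, free))
    return ("lam", to_de_bruijn(t[2], (t[1],) + tuple(bound), free))

# (λx. y x) with y as the first free variable becomes (λ #2 #1).
open_term = to_de_bruijn(
    ("lam", "x", ("app", ("var", "y"), ("var", "x"))), free=("y",))

# Two α-equivalent terms, (λx.λy.x) and (λa.λb.a), get the same encoding.
left = to_de_bruijn(("lam", "x", ("lam", "y", ("var", "x"))))
right = to_de_bruijn(("lam", "a", ("lam", "b", ("var", "a"))))
same = left == right
```

With this encoding, checking α-equivalence of two named terms reduces to comparing their translations with ordinary equality.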

We must also reconsider β-contraction in this notation. Given the de Bruijn β-redex
((λ M) N) we want to think about substituting N for the first free variable in M. But in
performing this contraction, we have also eliminated a lambda which was previously over
the term M. Thus all the free variables in M will have to have their index decremented
by one. Also, we may have to substitute beneath some lambdas which will require us
to renumber all the free variables in N. This leads us to consider a generalized notion of
substitution which allows us to substitute for every free variable in a term.

Definition 2.2.2 (De Bruijn substitution). Let t be a de Bruijn term and let s1, s2, s3, ...
be an infinite sequence of de Bruijn terms. The result of simultaneously substituting si for
the ith free variable in t is denoted by S(t; s1, s2, s3, ...) and is defined by

1. S(c; s1, s2, s3, ...) = c, for any constant c,

2. S(#i; s1, s2, s3, ...) = si, for any index #i,

3. S((t1 t2); s1, s2, s3, ...) = S(t1; s1, s2, s3, ...) S(t2; s1, s2, s3, ...), and

4. S((λ t); s1, s2, s3, ...) = λ S(t; #1, s1', s2', s3', ...) where si' = S(si; #2, #3, #4, ...).

The interesting case in this definition is when we descend underneath an abstraction.
Within this abstraction, the index #1 should be left untouched, the index #2 should
refer to what we were substituting for the first free variable, the index #3 should refer to
what we were substituting for the second free variable, etc. Further, since the arguments
being substituted in are going to be placed under this additional lambda, we need to
increment the indices of all free variables in those arguments. Given this extended definition of
substitution, we can define β-conversion for the de Bruijn notation.

Definition 2.2.3 (De Bruijn β-conversion). A term s results from a term r by a
β-contraction, denoted r ▷β s, if s can be obtained by replacing a subterm of r of the form
((λ t1) t2), referred to as a β-redex, by S(t1; t2, #1, #2, ...). We say also that s results from
r by a β-reduction, denoted r ▷β* s, if it can be obtained from r by a (possibly empty) sequence
of β-contractions. The term r is said to result from s by a β-expansion if s results from
r by a β-contraction. Finally, r and s are said to be β-equivalent if one results from the
other by a sequence of β-contractions and β-expansions, and we denote this by r =β s.
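The simultaneous substitution S and the contraction of a de Bruijn β-redex can be sketched in a few lines. The Python below is an illustration under stated assumptions rather than a reference implementation: terms are tagged tuples, and the infinite sequence s1, s2, ... is modelled as a function seq from a positive index i to the term si. The names S, seq, under, and beta_contract are choices of this sketch.

```python
# De Bruijn terms: ("const", c) | ("idx", i) | ("app", t1, t2) | ("lam", body)

def S(t, seq):
    """Simultaneous substitution S(t; s1, s2, ...), with seq(i) giving s_i."""
    tag = t[0]
    if tag == "const":
        return t
    if tag == "idx":
        return seq(t[1])
    if tag == "app":
        return ("app", S(t[1], seq), S(t[2], seq))

    # Under a binder, #1 is left untouched and #(i+1) receives s_i with
    # its free indices raised by one: s_i' = S(s_i; #2, #3, #4, ...).
    def under(i):
        if i == 1:
            return ("idx", 1)
        return S(seq(i - 1), lambda j: ("idx", j + 1))

    return ("lam", S(t[1], under))

def beta_contract(redex):
    """Contract a top-level redex ((λ t1) t2) to S(t1; t2, #1, #2, ...)."""
    (_, (_, t1), t2) = redex
    return S(t1, lambda i: t2 if i == 1 else ("idx", i - 1))

# The inner redex of (λ λ ((λ λ #2) #2)) contracts to (λ #3).
inner = ("app", ("lam", ("lam", ("idx", 2))), ("idx", 2))
contracted = beta_contract(inner)
```

Placing contracted back under the two outer lambdas gives (λ λ λ #3), the renumbering example from the introduction.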

While convenient from the implementation standpoint, the de Bruijn notation is not 
particularly readable for humans. For example, the term (λx.(λy.(λz.x) x) x) is encoded as
(λ (λ (λ #3) #2) #1). A naive glance at this term may suggest that three different variables
are referenced, even though the same variable is referenced each time. Similarly, the term
(λx.(λy.(λz.z) y) x) is encoded as (λ (λ (λ #1) #1) #1). Again, a naive glance may suggest
that the same variable is referenced three times when in fact each reference is to a different 
variable. For the rest of this thesis we shall use the de Bruijn notation, but because of the 
reasons stated above we will often present examples using the named lambda calculus. 
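
The two encodings above can be checked mechanically. The following sketch translates closed named terms into de Bruijn form; the tuple encodings and the helper names (`to_de_bruijn`, `lam`, `app`, `var`) are assumptions made only for this illustration.

```python
# Named terms (assumed encoding): ("var", x), ("lam", x, body), ("app", f, a).
# De Bruijn terms: ("idx", i), ("lam", body), ("app", f, a).

def to_de_bruijn(t, bound=()):
    """Translate a closed named term; `bound` lists binder names, innermost first."""
    tag = t[0]
    if tag == "var":
        # tuple.index returns the innermost binder with this name (ValueError if free).
        return ("idx", bound.index(t[1]) + 1)
    if tag == "lam":
        return ("lam", to_de_bruijn(t[2], (t[1],) + bound))
    if tag == "app":
        return ("app", to_de_bruijn(t[1], bound), to_de_bruijn(t[2], bound))
    raise ValueError("unknown term: %r" % (t,))


def lam(x, body):
    return ("lam", x, body)


def app(f, a):
    return ("app", f, a)


def var(x):
    return ("var", x)


# (lam x. (lam y. (lam z. x) x) x): one variable, three different indices.
same_variable = lam("x", app(lam("y", app(lam("z", var("x")), var("x"))), var("x")))
# (lam x. (lam y. (lam z. z) y) x): three variables, one index.
different_variables = lam("x", app(lam("y", app(lam("z", var("z")), var("y"))), var("x")))

print(to_de_bruijn(same_variable))
print(to_de_bruijn(different_variables))
```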

2.3 Properties of the Lambda Calculus 

The $\beta$-equivalence rule performs the "heavy-lifting" of the lambda calculus. In the
computational setting we use $\beta$-equivalence to determine the value of a computation, and in the
representational setting we use $\beta$-equivalence to determine if two terms represent the same
thing. In this section we look at properties of the lambda calculus which make it suitable
for use in both of these settings.

2.3.1 $\beta$-equivalence and Confluence

Determining $\beta$-equivalence is seemingly difficult because we can use both $\beta$-contraction and
$\beta$-expansion. Using $\beta$-expansion is impractical since we can apply it anywhere in a term,
thus we will try to restrict ourselves to $\beta$-contraction which limits us to considering only
the $\beta$-redexes of a term. In making this restriction, we may fear that we lose completeness,
i.e., that two terms are $\beta$-equivalent, but there is no common term to which they $\beta$-reduce.
The next theorem addresses this fear and assures us that such a situation cannot occur.

Before we can state the theorem, we need to introduce a diagram notation which is 
common in rewriting systems such as the lambda calculus. In such systems, we often have 
statements of the form "If P and Q hold, then R and S hold" where P, Q, R, and
S denote some relationships between terms. We represent this in a diagram by drawing 
solid arrows for the given properties, P and Q, and using dashed arrows for the resulting 
properties, R and S. 

Theorem 2.3.1 (Church-Rosser property). Let $t_1$ and $t_2$ be lambda terms such that $t_1 =_\beta t_2$.
Then there exists a term $t_3$ such that $t_1 \triangleright_\beta^* t_3$ and $t_2 \triangleright_\beta^* t_3$, i.e., the following diagram holds.



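[Diagram: $t_1 =_\beta t_2$ is given (solid); dashed $\triangleright_\beta^*$ arrows lead from $t_1$ and from $t_2$ to a common term $t_3$.]

This theorem tells us that there is no gap in completeness if we restrict ourselves to
$\beta$-reduction when determining $\beta$-equivalence. This result is equivalent to the following
property.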

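Theorem 2.3.2 (Confluence). Let $t_1$, $t_2$, $t_3$ be lambda terms such that $t_1 \triangleright_\beta^* t_2$ and $t_1 \triangleright_\beta^* t_3$.
Then there exists a term $t_4$ such that $t_2 \triangleright_\beta^* t_4$ and $t_3 \triangleright_\beta^* t_4$, i.e., the following diagram holds.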



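[Diagram: solid $\triangleright_\beta^*$ arrows lead from $t_1$ to $t_2$ and to $t_3$; dashed $\triangleright_\beta^*$ arrows lead from $t_2$ and from $t_3$ to a common term $t_4$.]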



2.3.2 Normal Forms and Typed Lambda Calculi 

The Church-Rosser property gives us some guidance in determining $\beta$-equivalence by
allowing us to consider only $\beta$-reduction. A general method for determining $\beta$-equivalence
is still incomplete since the process of $\beta$-reduction can go on indefinitely, such as for
$((\lambda x.x\ x)\ (\lambda x.x\ x))$. Thus we are often interested in a subset of lambda terms for which
$\beta$-contraction is no longer applicable. This is the content of the following definition and
theorem.

Definition 2.3.1 ($\beta$-normal form). A lambda term is in $\beta$-normal form if it has no
$\beta$-redexes. If $t_1$ and $t_2$ are lambda terms such that $t_1 \triangleright_\beta^* t_2$ and $t_2$ is in $\beta$-normal form, then
we say that $t_2$ is the $\beta$-normal form of $t_1$. When there is no ambiguity, we may call this
simply the normal form of $t_1$.

Theorem 2.3.3. $\beta$-normal forms are unique.

An even stronger property than having a $\beta$-normal form is for a term to be strongly
$\beta$-normalizing. This means that any sequence of $\beta$-reductions is terminating and therefore
reaches the $\beta$-normal form. When terms are strongly $\beta$-normalizing, we can determine
$\beta$-equivalence by reducing to $\beta$-normal form and comparing for equality. Thus, when dealing
with strongly $\beta$-normalizing terms, we have a complete decision procedure.
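
The decision procedure just described can be sketched as follows. This is only an illustration under our own assumptions: terms use an ad hoc tuple encoding, reduction is leftmost-outermost, and a step budget (`fuel`, a hypothetical parameter) stands in for strong normalization, since untyped terms need not terminate.

```python
# Terms (an assumed encoding): ("const", c), ("var", i), ("app", t1, t2), ("lam", t).

def subst(t, s):
    """De Bruijn substitution S(t; s(1), s(2), ...), with s a function on indices."""
    tag = t[0]
    if tag == "const":
        return t
    if tag == "var":
        return s(t[1])
    if tag == "app":
        return ("app", subst(t[1], s), subst(t[2], s))
    # Abstraction: keep #1 and raise the free variables of the substituted terms.
    return ("lam", subst(t[1], lambda i: ("var", 1) if i == 1
                         else subst(s(i - 1), lambda j: ("var", j + 1))))


def step(t):
    """Perform one leftmost-outermost beta-contraction; None if t is in normal form."""
    if t[0] == "app":
        f, a = t[1], t[2]
        if f[0] == "lam":
            return subst(f[1], lambda i: a if i == 1 else ("var", i - 1))
        r = step(f)
        if r is not None:
            return ("app", r, a)
        r = step(a)
        return None if r is None else ("app", f, r)
    if t[0] == "lam":
        r = step(t[1])
        return None if r is None else ("lam", r)
    return None


def normalize(t, fuel=1000):
    """Return the beta-normal form of t, or None if it is not reached within `fuel` steps."""
    for _ in range(fuel):
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt
    return t if step(t) is None else None


def beta_equivalent(t1, t2, fuel=1000):
    """Compare normal forms; None means the budget ran out and no answer is given."""
    n1, n2 = normalize(t1, fuel), normalize(t2, fuel)
    if n1 is None or n2 is None:
        return None
    return n1 == n2


identity = ("lam", ("var", 1))
konst = ("lam", ("lam", ("var", 2)))
c, d = ("const", "c"), ("const", "d")
self_app = ("lam", ("app", ("var", 1), ("var", 1)))
omega = ("app", self_app, self_app)

print(beta_equivalent(("app", identity, c), ("app", ("app", konst, c), d)))  # True
print(normalize(omega, fuel=50))  # None: this term contracts to itself forever
```

For strongly normalizing terms the budget is unnecessary in principle, and the comparison of normal forms is then a genuine decision procedure.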

One system for ensuring terms are strongly $\beta$-normalizing is the simply typed lambda
calculus [Chu40]. In this calculus we have a set of types which we assign to the constants in
the language. In addition, we annotate all lambdas with a type which represents the valid
argument type of the function. Using this, we can build up and assign types to a whole
range of terms. To begin with, we define the types in our language.

Definition 2.3.2 (Simple types). Simple types are defined by

$\tau ::= a \mid \tau \to \tau$

where $a$ ranges over a nonempty set of base types.

We use the names $A$ and $B$ to represent types. We assign a type $A$ to a term $t$ by
creating a typing judgement which says that $A$ is a valid type for $t$. The form of this
judgement is $\Gamma \vdash_\Sigma t : A$ where $\Gamma$, called the context, contains type assignments for the
free variables and $\Sigma$, called the signature, contains type assignments for the constants. The
context maintains type assignments for free variables using a stack of types $A_1.A_2.\cdots.A_n.\emptyset$
which represent types for the free variables $\#1, \#2, \ldots, \#n$, respectively. The rules for
constructing these typing judgements are given in Figure 2.1. Notice that in these rules,
the lambdas are annotated with the type of their arguments.

$$\frac{}{\Gamma \vdash_\Sigma c : A} \quad \mbox{where } c : A \in \Sigma$$

$$\frac{}{A.\Gamma \vdash_\Sigma \#1 : A}$$

$$\frac{\Gamma \vdash_\Sigma \#(i-1) : A'}{A.\Gamma \vdash_\Sigma \#i : A'} \quad \mbox{where } i > 1$$

$$\frac{\Gamma \vdash_\Sigma t_1 : B \to A \qquad \Gamma \vdash_\Sigma t_2 : B}{\Gamma \vdash_\Sigma (t_1\ t_2) : A}$$

$$\frac{A.\Gamma \vdash_\Sigma t : B}{\Gamma \vdash_\Sigma (\lambda_A\, t) : A \to B}$$

Figure 2.1: Typing rules for the simply typed lambda calculus 
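
The typing rules can be read as a small type-inference procedure. The sketch below is an illustration only: the tuple encodings of terms and types, the dictionary used for the signature, and the name `type_of` are all assumptions of ours. The recursive variable rule is unfolded into a direct lookup in the context.

```python
# Types (assumed encoding): a base type is a string, an arrow type is ("arrow", A, B).
# Terms: ("const", c), ("var", i), ("app", t1, t2), ("lam", A, t) with A the argument type.

def type_of(ctx, sig, t):
    """Return A such that ctx |- t : A under signature sig, or None if t is ill-typed.

    ctx is a tuple of types with the type of #1 first; sig maps constant names to types.
    """
    tag = t[0]
    if tag == "const":
        return sig.get(t[1])
    if tag == "var":
        i = t[1]
        # Unfolding the rule for #i (i > 1) repeatedly gives the i-th entry of the context.
        return ctx[i - 1] if 1 <= i <= len(ctx) else None
    if tag == "lam":
        arg_type, body = t[1], t[2]
        body_type = type_of((arg_type,) + ctx, sig, body)
        return None if body_type is None else ("arrow", arg_type, body_type)
    if tag == "app":
        fun_type = type_of(ctx, sig, t[1])
        arg_type = type_of(ctx, sig, t[2])
        if (isinstance(fun_type, tuple) and fun_type[0] == "arrow"
                and arg_type is not None and fun_type[1] == arg_type):
            return fun_type[2]
        return None
    return None


sig = {"c": "a"}
identity = ("lam", "a", ("var", 1))
print(type_of((), sig, identity))                          # ('arrow', 'a', 'a')
print(type_of((), sig, ("app", identity, ("const", "c"))))  # a
```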



The two key theorems about the simply typed lambda calculus are that $\beta$-reduction
preserves typing and that typed terms are strongly $\beta$-normalizing.

Theorem 2.3.4 (Preservation of types). If $\Gamma \vdash_\Sigma t : A$ and $t \triangleright_\beta t'$ then $\Gamma \vdash_\Sigma t' : A$.

Theorem 2.3.5 (Strong normalization of typed terms). If $\Gamma \vdash_\Sigma t : A$ then $t$ is strongly
$\beta$-normalizing.

The simply typed lambda calculus is only one example of a typing system which may be
layered on top of the lambda calculus. Many other typing systems have been developed and
put to use in real systems. For instance, $\lambda$Prolog [NM91] uses a polymorphic simply typed
lambda calculus [NP92] and Twelf [PS99] uses a dependently typed lambda calculus
[HHP93].

2.4 Existential Variables and Substitution 

The representational use of the lambda calculus produces a need for variables which can
be instantiated. For example, suppose we want to express a rewrite rule in logic which
says $\forall x.(F \land G(x)) \rightarrow F \land \forall x.G(x)$. Such a rule allows us to pull a term out of a universal
quantifier if it does not contain the variable being quantified over. Using the ideas discussed
in the introduction of this thesis we can think of encoding the left-hand side of this rule
as $forall\ (\lambda x.F \land (G\ x))$ where $F$ and $G$ are variables which will be instantiated upon
matching this rule with a specific instance. These variables are called meta variables and
determining appropriate substitutions for them is called higher-order unification.

The logical interpretation of meta variables requires that substitutions cannot capture 
variables. Thus in our example above, if F or G contain free occurrences of x, then the
bound variable $x$ must be renamed before the substitutions are performed. This restriction
is actually a benefit to the use of meta variables because it allows us to have precise control
over the possible variable occurrences within a substitution. For instance, although $G$
cannot bind the variable $x$ in our example, $x$ is provided as an argument so that $G$ can
be of the form $(\lambda x.G')$ where $G'$ is a term which has free occurrences of $x$. The process
of $\beta$-contraction then ties the knot and associates the free occurrences of $x$ in $G'$ with the
argument $x$ in $(G\ x)$. For example, consider substituting $(\lambda x.x + x > x)$ for $G$. Then the
left-hand side becomes

$forall\ (\lambda x.F \land ((\lambda x.x + x > x)\ x)) \triangleright_\beta forall\ (\lambda x.F \land (x + x > x))$

Conversely, since $x$ is not an argument to $F$ we know that $F$ does not depend on $x$ and so
moving the $F$ outside of the quantifier $\forall x$ is a logically sound operation.



Chapter 3 



The Suspension Calculus 



The lambda calculus revolves around $\beta$-contraction. In turn, $\beta$-contraction depends on a
monolithic substitution operation which is impractical for use in actual implementations. 
A solution to this is to move the substitution operation down from the meta level into 
the object level, i.e., make it explicitly part of the syntax of the lambda calculus. This 
allows for direct manipulation of substitutions and therefore fine-grained control over the 
substitution process. The focus of this chapter is on a specific explicit substitution calculus: 
the suspension calculus. 

The first half of this chapter, extending from Section 3.1 to Section 3.6, introduces the 
suspension calculus and possible variations on it. We start this introduction by motivating 
the notation used to encode substitutions in the suspension calculus and defining notation 
and rules based on this motivation. We then describe the relationship between the suspen- 
sion calculus of this thesis and the original suspension calculus from which it is created. 
Next we show how the suspension calculus admits typing in the style of the simply-typed 
lambda calculus. We then discuss different ways in which meta variables can be added to 
the calculus. 

The second half of this chapter, comprising the remaining sections, deals with proving 
important properties of the suspension calculus. First, we prove that the rules governing 
substitution are terminating, thus reflecting the finite nature of substitution in the lambda 
calculus. Second, we show that these substitution rules are confluent, so the choices we
make in performing substitution always result in the same normal form. Third, we show
that the suspension calculus faithfully models the process of $\beta$-reduction in the lambda
calculus. Fourth, we prove that the property of confluence extends to the full suspension 
calculus, thus making it a candidate for new approaches to unification. Finally, we prove 
a property intrinsic to the suspension calculus which relates different ways of representing 
substitutions.

3.1 Motivation for the Encoding of Substitutions 

Before we formally and explicitly define the syntax of the suspension calculus, it is beneficial
to consider what information needs to be reflected into the syntax. Note that we are
doing this reflection in the context of the de Bruijn notation since it provides a unique
representation of $\alpha$-equivalent terms. The first difficulty in this respect is that substitution
in the de Bruijn notation is an operation with infinitely many arguments. For instance we
have the $\beta$-contraction rule which says

$$((\lambda\, t_1)\ t_2) \triangleright_\beta S(t_1; t_2, \#1, \#2, \ldots).$$

The substitution here is for infinitely many variables and thus no naive embedding of this
into the syntax will work. Instead, it helps to start with a fresh view of the substitution
needed for $\beta$-reduction.

Consider the following term for which we want to perform $\beta$-reduction but delay its
effect on $t$,

$$(\ldots((\lambda \ldots (\lambda \ldots ((\lambda \ldots t \ldots)\ s_1) \ldots) \ldots)\ s_2) \ldots) \qquad (3.1)$$

Here we have a redex with $s_2$ as an argument and within the body of this redex we have
another redex with $s_1$ as an argument. We want to consider the effect on $t$ of contracting
these redexes. That is, we wish to produce a term of the following form:

$$(\ldots(\ldots(\lambda \ldots (\ldots t' \ldots) \ldots) \ldots) \ldots) \qquad (3.2)$$

where $t'$ is an encoding of the term $t$ together with the information needed to perform
the substitutions generated by contracting the two redexes. We call $t'$ a suspension since
it represents a suspended substitution. The information in this suspension consists of
substitutions for some variables and renumberings for the other variables. In order to
express this information, it helps to distinguish between two types of variables in $t$: those
that refer to a variable bound within the outermost lambda that is contracted and those
that are free with respect to the outermost lambda. We use the outermost lambda as
the reference point for this information since we hope to make all information local to the
contractions being made, i.e., the context of our redexes should not affect the term $t'$. In
the discussion that follows, we shall focus on the term starting with the outermost lambda
that is contracted and ignore the context in which it occurs.

For the variables that are free there is a renumbering which must be done to account
for the lambdas which occurred before the reduction that are gone after the reduction. In
order to define this, we introduce the idea of an embedding level which is the depth of a
term counted by the number of lambdas between it and the top level. The embedding level
of $t$ in (3.1) is called the old embedding level, $ol$, and the embedding level of $t'$ in (3.2) is
called the new embedding level, $nl$. For example, if the lambdas shown are the only lambdas
in the term, then the old embedding level is 3 and the new embedding level is 1. The number of
lambdas that are removed is $ol - nl$ and thus this is the renumbering to be done on the free
variables. Furthermore, we can easily find the free variables since they are the ones whose
index is greater than their embedding level, e.g., if $\#3$ occurs at embedding level 2 then
we know it occurs underneath two lambdas and represents the first free variable outside of
these lambdas.

In addition to renumberings for the free variables, we must provide substitutions for the
bound variables. In the case that a bound variable does not need a substitution (since its
binding lambda is not contracted) we will create a dummy substitution which substitutes
the first free variable, $\#1$, thus preserving the term. Now, the terms within substitutions
often come from a different embedding level than where we are thinking of substituting
them and must be renumbered so that their free variables are not captured by the context
into which we substitute. In order to do this, for each substitution we keep a number
indicating the embedding level from which it came, say $l$. Then when we need to perform a
substitution we increment all free variables in the substituted term by $nl - l$. We keep these
substitutions together with their embedding levels in a list, ordered from the first bound
variable to the last. This allows us to simply add a dummy substitution to the front of this
list in order to shift all indices when we descend underneath an abstraction.

Given the previous information for encoding substitutions, we write our encoding as
$[\![t, ol, nl, e]\!]$ where $t$ is the term over which substitution is performed, $ol$ is the old embedding
level, $nl$ is the new embedding level, and $e$, called the environment, is a list of substitutions
for the bound variables. The overall term $[\![t, ol, nl, e]\!]$ is called a suspension. If we looked
further into $e$, its structure would be $(t_1, l_1) :: (t_2, l_2) :: \cdots :: nil$ where the $t_i$ are terms and
the $l_i$ are the corresponding embedding levels.

The general operation of the suspension calculus will be to perform $\beta$-contractions which
produce suspensions and then to apply rewriting rules which move these suspensions deeper
and deeper into the tree until they are applied to a final term. During this, it is natural
that we might encounter a term of the form $[\![[\![t, ol_1, nl_1, e_1]\!], ol_2, nl_2, e_2]\!]$, which represents a
term $t$ together with two substitutions to be performed on it sequentially. We can think of
merging these two substitutions to produce a term of the form $[\![t, ol', nl', e']\!]$, i.e., a single
suspension which encodes the information of both of the previous ones. This requires careful
consideration of the values for $ol'$ and $nl'$ together with new syntax to represent the shape
of $e'$.

We can determine values for $ol'$ and $nl'$ by thinking carefully about the embedding
levels in the term $[\![[\![t, ol_1, nl_1, e_1]\!], ol_2, nl_2, e_2]\!]$. Here we have two substitutions over $t$ which
possibly overlap with each other. In detail, the effect on $t$ is first to substitute for the first
$ol_1$ free variables and then to raise all the free variables by $nl_1$. This raising has the effect
of embedding the term within $nl_1$ abstractions. Then the second substitution walks over
the resulting term, substituting for the first $ol_2$ free variables and then raising all the free
variables by $nl_2$. Note that some of these $ol_2$ substitutions will be vacuous since the free
variables have all been raised by $nl_1$. In fact, if $nl_1 \geq ol_2$, then all of the substitutions of
the second suspension are vacuous. In this case we have $ol' = ol_1$ since only the first $ol_1$
substitutions are performed. Also, $nl' = nl_2 + (nl_1 - ol_2)$ since the raising done is that
of $nl_2$ and $nl_1$ minus the $ol_2$ vacuous substitutions. In the case when $nl_1 < ol_2$, we have
$ol' = ol_1 + (ol_2 - nl_1)$ since we have the $ol_1$ substitutions and all but the first $nl_1$ of the
$ol_2$ substitutions. Also, $nl' = nl_2$ since all of the $nl_1$ raisings are consumed by the $ol_2$
substitutions and so the only raising left over is from the second suspension. These two
branching cases for $ol'$ and $nl'$ can be coalesced by using the monus operator $\dot{-}$ on natural
numbers, i.e., subtraction truncated at zero. Then we have $ol' = ol_1 + (ol_2 \dot{-} nl_1)$ and
$nl' = nl_2 + (nl_1 \dot{-} ol_2)$ in both cases.
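
The agreement between the case analysis and the single formula using truncated subtraction can be checked directly. In the sketch below the function names are hypothetical and chosen only for this illustration.

```python
def monus(a, b):
    """Subtraction on natural numbers, truncated at zero."""
    return max(a - b, 0)


def merged_levels(ol1, nl1, ol2, nl2):
    """Embedding levels (ol', nl') of the single suspension that replaces two nested ones."""
    return ol1 + monus(ol2, nl1), nl2 + monus(nl1, ol2)


def merged_levels_by_cases(ol1, nl1, ol2, nl2):
    """The same pair, computed with the explicit case split from the discussion above."""
    if nl1 >= ol2:
        # Every substitution of the second suspension is vacuous.
        return ol1, nl2 + (nl1 - ol2)
    # All nl1 raisings are consumed by the substitutions of the second suspension.
    return ol1 + (ol2 - nl1), nl2


# Inner suspension with ol1 = 1, nl1 = 0; outer suspension with ol2 = 2, nl2 = 1.
print(merged_levels(1, 0, 2, 1))  # (3, 1)
```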

We must also determine the shape of $e'$ after merging $[\![[\![t, ol_1, nl_1, e_1]\!], ol_2, nl_2, e_2]\!]$ into
$[\![t, ol', nl', e']\!]$. The result should roughly be the substitutions of $e_1$, modified by the
substitutions in $e_2$, together with some tail portion of $e_2$. For each term in $e_1$, we can use $nl_1$ to
compute the number of abstractions in which it is embedded. Using this we can prune off
the first elements of $e_2$ which correspond to these abstractions. When we have done this for
all elements of $e_1$, we only have left to determine which tail portion of $e_2$ to include. The
length of this tail portion should be the number of abstractions consumed by the second
suspension, $ol_2$, minus the number of abstractions created by the first suspension, $nl_1$. Thus
we can compute the total shape of $e'$ by knowing only $e_1$, $e_2$, $ol_2$, and $nl_1$. We write the
resulting form as $\{\!\{e_1, nl_1, ol_2, e_2\}\!\}$ and we call this a merged environment.

3.2 Syntax of the Suspension Calculus 

In this section we formally define the syntax of the suspension calculus and present measures
for assessing the wellformedness of expressions in this calculus. We begin by defining the
"pre-syntax" of suspension expressions, which is the raw syntax without any constraints on
wellformedness.

Definition 3.2.1 (Pre-syntax of suspension expressions). The "pre-syntax" of suspension
expressions is given by the following definitions of the syntactic categories of "terms" and
"environments."

$t ::= c \mid \#i \mid (t\ t) \mid (\lambda\, t) \mid [\![t, n, n, e]\!]$
$e ::= nil \mid ((t, l) :: e) \mid \{\!\{e, n, n, e\}\!\}$

Here $c$ ranges over an enumerable set of constants, $i$ ranges over the natural numbers, and
$n$ and $l$ range over the non-negative integers.

We call $\#i$ a variable index or reference, and we call $(t\ t)$ and $(\lambda\, t)$ application and
abstraction, respectively. The term $[\![t, n, n, e]\!]$ is called a suspension. The operation $::$ is a
consing operator on lists and the $(t, l)$ component of $((t, l) :: e)$ is called an environment
term. Finally, $\{\!\{e, n, n, e\}\!\}$ is called a merged environment. We collectively refer to all terms,
environments, and environment terms generated by the above definition as suspension
expressions. We will drop parentheses by assuming application is left associative, the scope of
a lambda extends as far right as possible, and $::$ is right associative. We will also assume
the obvious definition of subexpression.

In order to move from pre-syntax to syntax, we need to consider constraints on
expressions so that they "make sense." For example, in a term of the form $[\![t, ol_1, nl_1, e_1]\!]$ we are
thinking of performing substitution for the first $ol_1$ free variables using the substitutions in
$e_1$. Thus the number of substitutions in $e_1$, called the length of $e_1$, must be $ol_1$. Also, for
each substitution in $e_1$ we include an embedding level from which the substitution came.
Since substitutions will always move inward, it must be that we keep moving to deeper
embedding levels. Thus our current embedding level, $nl_1$, must be greater than or equal to
the embedding level of every environment term in $e_1$. We enforce this by defining a measure
called the level of $e_1$. These measures are the content of the following definitions.

Definition 3.2.2 (Length of an environment). The length of an environment $e$ is denoted
by $len(e)$ and is defined recursively by

$len(nil) = 0$
$len((t, n) :: e) = 1 + len(e)$
$len(\{\!\{e_1, nl_1, ol_2, e_2\}\!\}) = len(e_1) + (ol_2 \dot{-} nl_1)$

Definition 3.2.3 (Level of an environment or environment term). The level of an
environment $e$ is denoted by $lev(e)$ and is defined recursively by

$lev(nil) = 0$
$lev((t, n) :: e) = n$
$lev(\{\!\{e_1, nl_1, ol_2, e_2\}\!\}) = lev(e_2) + (nl_1 \dot{-} ol_2)$

Given these two definitions, we can define the syntax of suspension expressions.

Definition 3.2.4 (Syntax of suspension expressions). The syntax of suspension expressions
is those expressions in Definition 3.2.1 with the following additional wellformedness
constraints,

1. In any subexpression of the form $[\![t, ol, nl, e]\!]$, we must have $len(e) = ol$ and $lev(e) \leq nl$.

2. In any subexpression of the form $\{\!\{e_1, nl_1, ol_2, e_2\}\!\}$, we must have $len(e_2) = ol_2$ and
$lev(e_1) \leq nl_1$.

3. In any subexpression of the form $(t, n) :: e$, we must have $lev(e) \leq n$.
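
The two measures and the wellformedness constraints translate directly into recursive functions. The following sketch uses a tuple encoding of suspension expressions that is our own assumption, reads the level constraints as non-strict inequalities (matching the "greater than or equal to" wording above), and uses hypothetical names such as `wf_term` and `wf_env`.

```python
# Terms: ("const", c), ("var", i), ("app", t1, t2), ("lam", t), ("susp", t, ol, nl, e).
# Environments: ("nil",), ("cons", (t, l), e), ("merge", e1, nl1, ol2, e2).

def monus(a, b):
    return max(a - b, 0)


def length(e):
    """len(e) from Definition 3.2.2."""
    if e[0] == "nil":
        return 0
    if e[0] == "cons":
        return 1 + length(e[2])
    _, e1, nl1, ol2, _e2 = e
    return length(e1) + monus(ol2, nl1)


def level(e):
    """lev(e) from Definition 3.2.3."""
    if e[0] == "nil":
        return 0
    if e[0] == "cons":
        return e[1][1]
    _, _e1, nl1, ol2, e2 = e
    return level(e2) + monus(nl1, ol2)


def wf_term(t):
    """Check the constraints of Definition 3.2.4 on a term and all its subexpressions."""
    tag = t[0]
    if tag in ("const", "var"):
        return True
    if tag == "app":
        return wf_term(t[1]) and wf_term(t[2])
    if tag == "lam":
        return wf_term(t[1])
    _, body, ol, nl, e = t
    return wf_term(body) and wf_env(e) and length(e) == ol and level(e) <= nl


def wf_env(e):
    """Check the constraints of Definition 3.2.4 on an environment."""
    if e[0] == "nil":
        return True
    if e[0] == "cons":
        (t, n), rest = e[1], e[2]
        return wf_term(t) and wf_env(rest) and level(rest) <= n
    _, e1, nl1, ol2, e2 = e
    return wf_env(e1) and wf_env(e2) and length(e2) == ol2 and level(e1) <= nl1


NIL = ("nil",)
env = ("cons", (("var", 1), 1), ("cons", (("const", "s2"), 0), NIL))
print(length(env), level(env))                           # 2 1
print(wf_term(("susp", ("const", "t"), 2, 1, env)))      # True
print(wf_term(("susp", ("const", "t"), 2, 0, env)))      # False: lev(env) exceeds nl
```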

3.3 Rules of the Suspension Calculus

Now that we have created syntax which embeds substitutions we can consider rules which
operate on this modified syntax. The effect of these rules should be in three parts: (1)
to contract beta redexes to produce suspensions, (2) to move these suspensions down in
the syntax tree until they can be applied, and (3) to merge suspensions and compute the
resulting merged environment. The complete rules of the suspension calculus are listed in
Figure 3.1 and are divided into the three categories described above: the $\beta_s$ rule, the reading
rules, and the merging rules.

The $\beta_s$ rule simulates $\beta$-contraction in the suspension calculus using the suspension
syntax to encode the effect of substitution. This rule rewrites the $\beta$-redex $((\lambda\, t_1)\ t_2)$ to
$[\![t_1, 1, 0, (t_2, 0) :: nil]\!]$, which says that we substitute $t_2$ for the first free variable in $t_1$ and
decrement all other free variables by one.

The second category of rules is the reading rules, (r1)-(r6), which provide a means for
moving suspensions down in a term and also performing substitutions. Taken together, the
$\beta_s$ rule and the reading rules form an adequate simulation of the de Bruijn calculus, as we
shall prove in Section 3.9.

($\beta_s$) $((\lambda\, t_1)\ t_2) \to [\![t_1, 1, 0, (t_2, 0) :: nil]\!]$.

(r1) $[\![c, ol, nl, e]\!] \to c$, provided $c$ is a constant.

(r2) $[\![\#i, 0, nl, nil]\!] \to \#j$, where $j = i + nl$.

(r3) $[\![\#1, ol, nl, (t, l) :: e]\!] \to [\![t, 0, nl', nil]\!]$, where $nl' = nl - l$.

(r4) $[\![\#i, ol, nl, (t, l) :: e]\!] \to [\![\#i', ol', nl, e]\!]$,
where $i' = i - 1$ and $ol' = ol - 1$, provided $i > 1$.

(r5) $[\![(t_1\ t_2), ol, nl, e]\!] \to ([\![t_1, ol, nl, e]\!]\ [\![t_2, ol, nl, e]\!])$.

(r6) $[\![(\lambda\, t), ol, nl, e]\!] \to (\lambda\, [\![t, ol', nl', (\#1, nl') :: e]\!])$,
where $ol' = ol + 1$ and $nl' = nl + 1$.

(m1) $[\![[\![t, ol_1, nl_1, e_1]\!], ol_2, nl_2, e_2]\!] \to [\![t, ol', nl', \{\!\{e_1, nl_1, ol_2, e_2\}\!\}]\!]$,
where $ol' = ol_1 + (ol_2 \dot{-} nl_1)$ and $nl' = nl_2 + (nl_1 \dot{-} ol_2)$.

(m2) $\{\!\{e_1, nl_1, 0, nil\}\!\} \to e_1$.

(m3) $\{\!\{nil, 0, ol_2, e_2\}\!\} \to e_2$.

(m4) $\{\!\{nil, nl_1, ol_2, (t, l) :: e_2\}\!\} \to \{\!\{nil, nl_1', ol_2', e_2\}\!\}$,
where $nl_1' = nl_1 - 1$ and $ol_2' = ol_2 - 1$, provided $nl_1 \geq 1$.

(m5) $\{\!\{(t, n) :: e_1, nl_1, ol_2, (s, l) :: e_2\}\!\} \to \{\!\{(t, n) :: e_1, nl_1', ol_2', e_2\}\!\}$,
where $nl_1' = nl_1 - 1$ and $ol_2' = ol_2 - 1$, provided $nl_1 > n$.

(m6) $\{\!\{(t, n) :: e_1, n, ol_2, (s, l) :: e_2\}\!\} \to ([\![t, ol_2, l, (s, l) :: e_2]\!], m) :: \{\!\{e_1, n, ol_2, (s, l) :: e_2\}\!\}$,
where $m = l + (n \dot{-} ol_2)$.

Figure 3.1: Rewrite rules for the suspension calculus

The merging rules, (m1)-(m6), are the final category of rules. The (m1) rule enables
us to merge two suspensions into a single suspension, at the cost of creating a merged
environment. The rules (m2)-(m6) then allow us to evaluate this merged environment in a
lazy way.

Definition 3.3.1. The reduction relations generated by the rules in Figure 3.1 are denoted
by $\triangleright_{\beta_s}$, $\triangleright_r$, and $\triangleright_m$. The relations $\triangleright_{rm}$, $\triangleright_{r\beta_s}$, and $\triangleright_{rm\beta_s}$ are the appropriate unions of
those relations. If $R$ corresponds to any of these relations then we will use $R^*$ to denote its
reflexive and transitive closure.

The following example illustrates a use of these rules where $t$, $s_1$, and $s_2$ are arbitrary
suspension expressions. This example is a simplified version of (3.1).

$(\lambda\, \lambda\, ((\lambda\, t)\ s_1))\ s_2$

$\triangleright_{\beta_s}^* [\![\lambda\, [\![t, 1, 0, (s_1, 0) :: nil]\!], 1, 0, (s_2, 0) :: nil]\!]$

$\triangleright_r \lambda\, [\![[\![t, 1, 0, (s_1, 0) :: nil]\!], 2, 1, (\#1, 1) :: (s_2, 0) :: nil]\!]$

$\triangleright_m \lambda\, [\![t, 3, 1, \{\!\{(s_1, 0) :: nil, 0, 2, (\#1, 1) :: (s_2, 0) :: nil\}\!\}]\!]$

$\triangleright_m^* \lambda\, [\![t, 3, 1, ([\![s_1, 2, 1, (\#1, 1) :: (s_2, 0) :: nil]\!], 1) :: (\#1, 1) :: (s_2, 0) :: nil]\!]$

The outermost suspension here encodes three substitutions to be made over $t$. The first
substitution is $s_1$, modified by substituting $s_2$ for its second free variable. The second
substitution is a dummy substitution which corresponds to the lambda that remains after
contraction. Finally, the last substitution corresponds to substituting in $s_2$ and since the
embedding level of this is one less than the new embedding level, we will have to raise all
the free variables of $s_2$ which corresponds to our substituting of $s_2$ underneath a lambda.
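
The example can be replayed mechanically. The sketch below applies the rules of Figure 3.1 at the root of an expression only; the tuple encoding, the helper names (`rewrite_term`, `rewrite_env`, `susp`, `cons`), and the use of constants as stand-ins for the arbitrary expressions $t$, $s_1$, and $s_2$ are all assumptions made for illustration, and the choice of where to apply each rule is made by hand rather than by a reduction strategy.

```python
# Terms: ("const", c), ("var", i), ("app", t1, t2), ("lam", t), ("susp", t, ol, nl, e).
# Environments: ("nil",), ("cons", (t, l), e), ("merge", e1, nl1, ol2, e2).

def monus(a, b):
    return max(a - b, 0)

NIL = ("nil",)

def cons(t, l, e):
    return ("cons", (t, l), e)

def susp(t, ol, nl, e):
    return ("susp", t, ol, nl, e)

def rewrite_term(t):
    """Apply one rule of Figure 3.1 at the root of a term, or return None."""
    if t[0] == "app" and t[1][0] == "lam":                        # (beta_s)
        return susp(t[1][1], 1, 0, cons(t[2], 0, NIL))
    if t[0] != "susp":
        return None
    _, b, ol, nl, e = t
    if b[0] == "const":                                            # (r1)
        return b
    if b[0] == "var":
        if ol == 0 and e == NIL:                                   # (r2)
            return ("var", b[1] + nl)
        if e[0] != "cons":
            return None  # a merged environment has to be evaluated first
        (s, l), rest = e[1], e[2]
        if b[1] == 1:                                              # (r3)
            return susp(s, 0, nl - l, NIL)
        return susp(("var", b[1] - 1), ol - 1, nl, rest)           # (r4)
    if b[0] == "app":                                              # (r5)
        return ("app", susp(b[1], ol, nl, e), susp(b[2], ol, nl, e))
    if b[0] == "lam":                                              # (r6)
        return ("lam", susp(b[1], ol + 1, nl + 1, cons(("var", 1), nl + 1, e)))
    _, t0, ol1, nl1, e1 = b                                        # (m1)
    return susp(t0, ol1 + monus(ol, nl1), nl + monus(nl1, ol), ("merge", e1, nl1, ol, e))

def rewrite_env(e):
    """Apply one of (m2)-(m6) at the root of an environment, or return None."""
    if e[0] != "merge":
        return None
    _, e1, nl1, ol2, e2 = e
    if ol2 == 0 and e2 == NIL:                                     # (m2)
        return e1
    if nl1 == 0 and e1 == NIL:                                     # (m3)
        return e2
    if e2[0] != "cons" or e1[0] == "merge":
        return None  # an inner merged environment has to be evaluated first
    (s, l), rest2 = e2[1], e2[2]
    if e1 == NIL:                                                  # (m4), here nl1 >= 1
        return ("merge", NIL, nl1 - 1, ol2 - 1, rest2)
    (t, n), rest1 = e1[1], e1[2]
    if nl1 > n:                                                    # (m5)
        return ("merge", e1, nl1 - 1, ol2 - 1, rest2)
    if nl1 == n:                                                   # (m6)
        return cons(susp(t, ol2, l, e2), l + monus(n, ol2), ("merge", rest1, n, ol2, e2))
    return None

T, S1, S2 = ("const", "t"), ("const", "s1"), ("const", "s2")
inner = rewrite_term(("app", ("lam", T), S1))                 # (beta_s) on the inner redex
step1 = rewrite_term(("app", ("lam", ("lam", inner)), S2))    # (beta_s) on the outer redex
step2 = rewrite_term(step1)                                   # (r6)
step3 = rewrite_term(step2[1])                                # (m1) under the lambda
head = rewrite_env(step3[4])                                  # (m6)
final_env = ("cons", head[1], rewrite_env(head[2]))           # (m3) on the tail
result = ("lam", susp(step3[1], step3[2], step3[3], final_env))
print(result)
```

Because the stand-ins are constants, applying (r1) to them would erase the suspensions, which is why the rules are applied only at the positions used in the example.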

In order for our rules to make sense, they need to produce terms that make sense. In
formal terms, we need to ensure that using a rule on a well-formed expression produces a
well-formed expression.

Theorem 3.3.1. Let $e$ be a well-formed suspension expression and let $e \triangleright_{rm\beta_s} e'$. Then $e'$
is a well-formed suspension expression.

Proof. This property must be proved simultaneously with two other properties: if $e$ is
an environment then $lev(e) \geq lev(e')$ and $len(e) = len(e')$. The reason is that if we
could rewrite an environment so that its level increases or so its length changes, then we
might break the wellformedness of an expression such as $[\![t, ol, nl, e]\!]$ where $lev(e) \leq nl$ and
$len(e) = ol$ by rewriting $e$ to $e'$ such that $lev(e') > nl$ or $len(e') \neq ol$. The proof of all three
properties simultaneously is a simple case analysis on the rewrite from $e$ to $e'$. We give two
examples here in order to give the flavor of the argument.

Consider (r3) and suppose that the left hand side is well-formed. This means that
$ol = len((t, l) :: e)$ and $nl \geq lev((t, l) :: e) = l$. In order for the right hand side to be
well-formed we must have $0 = len(nil)$ and $nl' \geq lev(nil) = 0$. The first is trivially true,
and the second requires that we show $nl - l \geq 0$ which follows from $nl \geq l$.

As a second example, consider (m6). Assuming the left hand side is well-formed yields
$n \geq lev((t, n) :: e_1)$, $ol_2 = len((s, l) :: e_2)$, and $lev(et) \geq lev(e)$. In order to show the
right hand side is well-formed we must have $l \geq lev((s, l) :: e_2)$, $ol_2 = len((s, l) :: e_2)$,
$lev(et) \geq lev((s, l) :: e_2)$, $m \geq lev(\{\!\{e_1, n, ol_2, (s, l) :: e_2\}\!\})$, $n \geq lev(e_1)$, $ol_2 = len((s, l) :: e_2)$,
and $l \geq lev(e_2)$. All of these follow directly. Since (m6) is a rule on environments, we must
also verify that this rewriting does not cause the level of this environment to increase. The
level of the left hand side is $lev((s, l) :: e_2) + (n \dot{-} ol_2)$ and the level of the right hand side
is $m = l + (n \dot{-} ol_2)$. Since $l = lev((s, l) :: e_2)$, this follows easily. Finally, it is plain to see
that the length is preserved. □

From here on we will speak only of well-formed suspension expressions and thus drop
the "well-formed" label on them. The final definition and lemma of this section
allow us to massage environments into a convenient form for use with the rewrite
rules. This will be of importance in Section 3.4 and Section 3.8.

Definition 3.3.2 (Simple environments and truncating). A simple environment is one of
the form $(t_0, l_0) :: (t_1, l_1) :: \cdots :: (t_{n-1}, l_{n-1}) :: nil$, with $n$ possibly zero. For $0 \leq i < n$,
we write $e\{i\}$ to denote the truncated environment with the first $i$ elements removed, i.e.,
$(t_i, l_i) :: \cdots :: (t_{n-1}, l_{n-1}) :: nil$, with $i$ possibly zero. We extend this notation by letting $e\{i\}$
denote $nil$ in the case that $i \geq len(e)$ for any simple environment $e$.

Lemma 3.3.1. Let $e$ be an environment. Then there exists a simple environment $e'$ such
that $e \triangleright_{rm}^* e'$.

Proof. The proof is by case analysis on the structure of e, basically showing that if e is not 
a simple environment then we can always apply a rule (m2)-(m6) to it. Connecting this to 
the result requires that we also know that the rewrite rules are terminating, which is shown 
in Section 3.7. □ 



3.4 Relationship to the Original Suspension Calculus 

The current suspension calculus is based on the original suspension calculus by Nadathur 
and Wilson [NW98]. There are two differences between these calculi: the way dummy 
substitutions are handled in the rule (r6) and the way merging is performed using (m2)- 
(m6). In this section we highlight these differences and explain the connection between 
these two calculi. Before we begin, however, we note that the original suspension calculus 
had a separate syntactic class for environment terms because it had more possibilities for
such expressions. Hence, in this section we will treat environment terms as a separate 
syntactic class in both calculi. 

The first difference is how dummy substitutions are handled in the case of rule (r6).
When pushing a suspension underneath an abstraction there is a need to generate a
substitution for the bound variable. Since we are not actually contracting a redex, this
substitution should have no net effect and is thus a dummy substitution compared to those
substitutions generated by $\beta$-contraction. In the original calculus, the rule for pushing a
suspension underneath an abstraction had the form

$[\![\lambda\, t, ol, nl, e]\!] \to \lambda\, [\![t, ol + 1, nl + 1, @nl :: e]\!]$

This new $@nl$ environment term was conceived of as an optimization to separate real and
dummy substitutions, and it also has the benefit of simplifying the proof of termination for
the reading and merging rules. Nevertheless, we can simplify the calculus significantly by
using the environment term $(\#1, nl + 1)$ instead. This environment term has the same effect
and also allows us to have a simpler system since we can exclude all the old
rules for manipulating $@nl$ forms. With this change, the results of the original suspension
calculus paper [NW98] still hold, which we will assume.

The larger difference between the two calculi is the way merging is performed using
(m2)-(m6). Taking into account the change above, the $\beta_s$ and reading rules are the same
in both calculi, but the merging rules are still significantly different. The merging rules for
the original calculus are presented in Figure 3.2. The rule (m8') makes use of a measure
called the index which we will say more about later in this section. For now, it is sufficient
to think of the index as a measure very similar to the level. Pay special attention to (m5')
which says

$\{\!\{et :: e_1, nl_1, ol_2, e_2\}\!\} \to \langle\!\langle et, nl_1, ol_2, e_2 \rangle\!\rangle :: \{\!\{e_1, nl_1, ol_2, e_2\}\!\}$

The effect of this rule is to eagerly propagate the effects of $e_2$ onto the environment term $et$
by creating a new environment term $\langle\!\langle et, nl_1, ol_2, e_2 \rangle\!\rangle$. This new form is then used to prune
$e_2$ using rules (m6') and (m7') until only the portion relevant to $et$ is left. In the current
calculus, this pruning is done on the form $\{\!\{e_1, nl_1, ol_2, e_2\}\!\}$ using rule (m5) and thus the
work of this pruning is shared among the environment terms of $e_1$. This change in the current
calculus allows us to use fewer syntactic forms and also fewer rules, both of which reduce
the effort required to understand the calculus.

A possible downside of these simplifications is that we might lose some desirable theoretical
properties of the calculus. This turns out not to be the case, as we prove later in this
chapter. In fact, the simplifications to the calculus allow new results such as a typed version
of the calculus in Section 3.5 and a translation to another explicit substitution calculus in
Section 4.2. In the rest of this section, we will look at the more formal relationship between
our calculus and its predecessor.

(m1') $[\![[\![t, ol_1, nl_1, e_1]\!], ol_2, nl_2, e_2]\!] \rightarrow [\![t, ol', nl', \{\!\{e_1, nl_1, ol_2, e_2\}\!\}]\!]$,
      where $ol' = ol_1 + (ol_2 - nl_1)$ and $nl' = nl_2 + (nl_1 - ol_2)$.

(m2') $\{\!\{nil, nl, 0, nil\}\!\} \rightarrow nil$.

(m3') $\{\!\{nil, 0, ol, e\}\!\} \rightarrow e$.

(m4') $\{\!\{nil, nl, ol, et :: e\}\!\} \rightarrow \{\!\{nil, nl', ol', e\}\!\}$,
      where $nl, ol \geq 1$, $nl' = nl - 1$ and $ol' = ol - 1$.

(m5') $\{\!\{et :: e_1, nl, ol, e_2\}\!\} \rightarrow \langle\!\langle et, nl, ol, e_2\rangle\!\rangle :: \{\!\{e_1, nl, ol, e_2\}\!\}$.

(m6') $\langle\!\langle et, nl, 0, nil\rangle\!\rangle \rightarrow et$.

(m7') $\langle\!\langle et, nl, ol, et' :: e\rangle\!\rangle \rightarrow \langle\!\langle et, nl', ol', e\rangle\!\rangle$,
      where $nl' = nl - 1$ and $ol' = ol - 1$, provided $ind(et) < nl$.

(m8') $\langle\!\langle (t, nl), nl, ol, et :: e\rangle\!\rangle \rightarrow ([\![t, ol, l', et :: e]\!], m)$,
      where $l' = ind(et)$ and $m = l' + (nl - ol)$.

Figure 3.2: Merging rules for the original suspension calculus

Suspension expressions in our context are a subset of the original suspension expressions.
The only difficulty in showing this is that the wellformedness rules for the original suspension
calculus are defined the same as in Definition 3.2.4 except with the measure level replaced
by the measure index, defined as follows.

Definition 3.4.1 (Index of an environment or environment term). Given a natural number
$i$, the $i$-th index of an environment $e$ is denoted by $ind_i(e)$ and is defined as follows:

1. If $e$ is $nil$ then $ind_i(e) = 0$.

2. If $e$ is $(t, k) :: e'$ then $ind_i(e)$ is $k$ if $i = 0$ and $ind_{i-1}(e')$ otherwise.

3. If $e$ is $\{\!\{e_1, nl, ol, e_2\}\!\}$, let $m = (nl - ind_i(e_1))$ and $l = len(e_1)$. Then

\[
ind_i(e) =
\begin{cases}
ind_m(e_2) + (nl - ol) & \text{if } i < l \text{ and } len(e_2) > m \\
ind_i(e_1) & \text{if } i < l \text{ and } len(e_2) \leq m \\
ind_{(i - l + nl)}(e_2) & \text{if } i \geq l.
\end{cases}
\]

The index of an environment, denoted by $ind(e)$, is $ind_0(e)$.
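
To make the case analysis concrete, the following is a minimal Python sketch of the index measure for environments of the current calculus. It is an illustration only: the list-and-tuple encoding, the names `monus` and `ind`, and the restriction to merged environments whose two components are simple are assumptions made here, and the subtractions $nl - ind_i(e_1)$ and $nl - ol$ are assumed to be truncated at zero.

```python
# Encoding (hypothetical, for illustration only):
#   simple environment : Python list of (term, level) pairs; nil is []
#   merged environment : ("merge", e1, nl, ol, e2) with e1, e2 simple

def monus(a, b):
    """Subtraction on natural numbers, truncated at zero (an assumption)."""
    return max(a - b, 0)

def ind(e, i=0):
    """i-th index of an environment, following the three clauses of the definition."""
    if isinstance(e, list):
        # Clauses 1 and 2: walk i steps down the list; nil has index 0.
        return e[i][1] if i < len(e) else 0
    _, e1, nl, ol, e2 = e
    l = len(e1)
    if i >= l:
        # Third case of clause 3: the position lies beyond e1.
        return ind(e2, i - l + nl)
    m = monus(nl, ind(e1, i))
    if len(e2) > m:
        # First case: the i-th term of e1 is affected by e2.
        return ind(e2, m) + monus(nl, ol)
    # Second case: e2 is too short to affect the i-th term of e1.
    return ind(e1, i)

# Three small environments, one per case of clause 3.
first = ("merge", [("t", 2)], 2, 1, [("s", 0)])
second = ("merge", [("t", 1)], 3, 1, [("s", 0)])
third = ("merge", [], 1, 2, [("a", 5), ("b", 3)])
print(ind(first), ind(second), ind(third))
```

In the first environment the head of $e_1$ sits exactly at level $nl$, so its index is determined by the head of $e_2$; in the second, $e_2$ is pruned away before it can reach that head; in the third, $e_1$ is empty and the index comes from what remains of $e_2$.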






Intuitively, the index of an environment is the embedding level of its first environment
term and zero if the environment is nil. The index of an environment term $et$ is the value
of $n$ for which $et \rhd^{*}_{rm'} (t, n)$. We can consider the index measure on expressions in the current
calculus as well. Note that this measure is equal to the level on environment terms in
the current calculus because they already have the form $(t, n)$. The only difference in the
current calculus between index and level is in the case of merged environments, where the
level is an upper bound on the index.

Lemma 3.4.1. Let $e$ be a well-formed environment in the current calculus. Then
$lev(e) \geq ind(e)$.

Proof. First generalize to $lev(e) \geq ind_i(e)$ for all $i$. Then the proof proceeds by induction
on the structure of $e$. □

Lemma 3.4.2. Well-formed terms in the current suspension calculus are well-formed in
the original suspension calculus.

Proof. This is a direct consequence of the previous lemma. For example, if $[\![t, ol, nl, e]\!]$ is a
well-formed term of the current calculus then we have $ol = len(e)$ and $nl \geq lev(e) \geq ind(e)$.
Thus it is a well-formed term in the original suspension calculus. □

We now know that the well-formed expressions we are working with in the current
calculus are also well-formed in the original calculus. We can also show that the rules of
the current calculus are either derived or admissible rules for the original calculus. We will
consider each rule of the current calculus in turn, ignoring rules that are the same in both
calculi, i.e., (m1), (m3), and (m4).

The rule (m2) operates on an environment of the form $\{\!\{e_1, nl_1, 0, nil\}\!\}$. By Lemma 3.3.1,
the environment $e_1$ can be rewritten to the form $et_0 :: et_1 :: \ldots :: et_{n-1} :: nil$. Then by
applying (m5') $n$ times we can get

$\langle\!\langle et_0, nl_1, 0, nil\rangle\!\rangle :: \langle\!\langle et_1, nl_1, 0, nil\rangle\!\rangle :: \ldots :: \langle\!\langle et_{n-1}, nl_1, 0, nil\rangle\!\rangle :: \{\!\{nil, nl_1, 0, nil\}\!\}$

Using (m2') and $n$ applications of (m6') this rewrites to $et_0 :: et_1 :: \ldots :: et_{n-1} :: nil$. Thus
both $\{\!\{e_1, nl_1, 0, nil\}\!\}$ and $e_1$ can rewrite to a common term and therefore the rule (m2) is
admissible.

For rule (m5) we cite the original suspension paper where Lemma 6.10 states that
$\{\!\{e_1, nl + 1, ol + 1, et :: e_2\}\!\}$ and $\{\!\{e_1, nl, ol, e_2\}\!\}$ rewrite to a common term if $ind(e_1) \leq nl$
[NW98]. Restricting this to the case where $e_1 = (t, n) :: e_1'$ and restating the requirement
as $n < nl + 1$ yields the rule (m5). Thus (m5) is admissible.

Finally consider (m6). Assuming $\{\!\{(t, n) :: e_1, n, ol_2, et :: e_2\}\!\}$ is an environment in the current
suspension calculus, we can rewrite it using (m5') followed by (m8') to produce

$([\![t, ol_2, l, et :: e_2]\!], m) :: \{\!\{e_1, n, ol_2, et :: e_2\}\!\}$

where $l = ind(et)$ and $m = l + (n - ol_2)$. Since $et$ is an environment term in the current
calculus we have $l = lev(et)$ and thus (m6) is a derived rule of the original calculus.

We can formalize our observations into the following theorem which will be useful in
Section 3.9 when we show that the current calculus properly simulates $\beta$-contraction.

Theorem 3.4.1 (Same normal form). Let $x$ be a well-formed (current) suspension expression.
Then the $\rhd_{rm}$-normal form of $x$ is also the $\rhd_{rm'}$-normal form of $x$.

Proof. This theorem depends on the result that the reading and merging rules are confluent
and terminating in both calculi (see Section 3.7 and Section 3.8 for the current calculus,
[NW98] for the original calculus). The result then follows by induction on the rewrite
sequence which takes $x$ to its $\rhd_{rm}$-normal form. □

3.5 Typed Version of the Suspension Calculus

In this section we present a typed version of the suspension calculus which was not previously
possible with the original suspension calculus. This typed version is consistent with the
simply-typed lambda calculus from Section 2.3.2 and motivated in the same way. In fact our
typing judgment for terms of the form $c$, $(\lambda\, t)$, and $(t_1\ t_2)$ is the same as in the simply-typed
lambda calculus, but significant complexity is introduced to handle the new judgment for
$[\![t, ol, nl, e]\!]$. The issue is that we must interpret the term $t$ in the context of the substitutions
encoded in $e$ and relative to $nl$. This necessitates a new judgment which talks about the
effect of an environment (relative to an embedding level) on a typing context. The form
of this judgment is $\Gamma \vdash_{\Sigma} e \rhd_{nl} \Gamma'$, where $\Gamma$ and $\Gamma'$ are contexts, $\Sigma$ is a signature, $e$ is an
environment, and $nl$ is an integer for the embedding level. All of the typing rules are presented
in Figure 3.3.

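The following result establishes the wellformedness of our typing judgments.

Theorem 3.5.1. Type judgments are preserved by the rewrite relations.

\[
\dfrac{}{\Gamma \vdash_{\Sigma} c : A}\ \text{where } c : A \in \Sigma
\qquad
\dfrac{}{A.\Gamma \vdash_{\Sigma} \#1 : A}
\qquad
\dfrac{\Gamma \vdash_{\Sigma} \#(i - 1) : A'}{A.\Gamma \vdash_{\Sigma} \#i : A'}
\]

\[
\dfrac{\Gamma \vdash_{\Sigma} t_1 : B \to A \qquad \Gamma \vdash_{\Sigma} t_2 : B}{\Gamma \vdash_{\Sigma} (t_1\ t_2) : A}
\qquad
\dfrac{A.\Gamma \vdash_{\Sigma} t : B}{\Gamma \vdash_{\Sigma} (\lambda_A\, t) : A \to B}
\]

\[
\dfrac{\Gamma \vdash_{\Sigma} e \rhd_{nl} \Gamma' \qquad \Gamma' \vdash_{\Sigma} t : A}{\Gamma \vdash_{\Sigma} [\![t, ol, nl, e]\!] : A}
\qquad
\dfrac{}{\Gamma \vdash_{\Sigma} nil \rhd_{0} \Gamma}
\qquad
\dfrac{\Gamma \vdash_{\Sigma} nil \rhd_{nl-1} \Gamma'}{A.\Gamma \vdash_{\Sigma} nil \rhd_{nl} \Gamma'}\ \text{where } nl > 0
\]

\[
\dfrac{\Gamma \vdash_{\Sigma} (t, n) :: e \rhd_{nl-1} \Gamma'}{A.\Gamma \vdash_{\Sigma} (t, n) :: e \rhd_{nl} \Gamma'}\ \text{where } nl > n
\qquad
\dfrac{\Gamma \vdash_{\Sigma} t : A \qquad \Gamma \vdash_{\Sigma} e \rhd_{n} \Gamma'}{\Gamma \vdash_{\Sigma} (t, n) :: e \rhd_{n} A.\Gamma'}
\]

\[
\dfrac{\cdots}{\Gamma \vdash_{\Sigma} \{\!\{e_1, nl', ol', e_2\}\!\} \rhd_{nl} \Gamma''}
\]

Figure 3.3: Typing rules for the typed suspension calculus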



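Proof. The critical thing to show is that if $\ell \rightarrow r$ is a rewrite rule and $\Gamma \vdash_{\Sigma} \ell : A$ (respectively
$\Gamma \vdash_{\Sigma} \ell \rhd_{nl} \Gamma'$) then $\Gamma \vdash_{\Sigma} r : A$ (respectively $\Gamma \vdash_{\Sigma} r \rhd_{nl} \Gamma'$). The proof proceeds by cases on
the rewrite rule and we focus here on a few interesting cases.

Let the rewrite rule be ($\beta_s$). Then we are given that $\Gamma \vdash_{\Sigma} ((\lambda_A\, t_1)\ t_2) : B$ from which
we know the derivation tree must be

\[
\dfrac{\dfrac{\dfrac{D_1}{A.\Gamma \vdash_{\Sigma} t_1 : B}}{\Gamma \vdash_{\Sigma} (\lambda_A\, t_1) : A \to B}
       \qquad
       \dfrac{D_2}{\Gamma \vdash_{\Sigma} t_2 : A}}
      {\Gamma \vdash_{\Sigma} ((\lambda_A\, t_1)\ t_2) : B}
\]

where $D_1$ and $D_2$ are appropriate derivations. We can then use these derivations to construct
the following typing judgment.

\[
\dfrac{\dfrac{\Gamma \vdash_{\Sigma} t_2 : A \qquad \Gamma \vdash_{\Sigma} nil \rhd_{0} \Gamma}
             {\Gamma \vdash_{\Sigma} (t_2, 0) :: nil \rhd_{0} A.\Gamma}
       \qquad
       A.\Gamma \vdash_{\Sigma} t_1 : B}
      {\Gamma \vdash_{\Sigma} [\![t_1, 1, 0, (t_2, 0) :: nil]\!] : B}
\]

This completes the argument for ($\beta_s$).

Consider the case of (r3). Then we must have the following typing derivation.

\[
\dfrac{\dfrac{\dfrac{\begin{array}{c}
          \dfrac{\dfrac{D_1}{\Gamma \vdash_{\Sigma} t : A} \qquad \dfrac{D_2}{\Gamma \vdash_{\Sigma} e \rhd_{l} \Gamma'}}
                {\Gamma \vdash_{\Sigma} (t, l) :: e \rhd_{l} A.\Gamma'} \\
          \vdots
        \end{array}}
       {A_{nl-1}.\cdots.A_{l+1}.\Gamma \vdash_{\Sigma} (t, l) :: e \rhd_{nl-1} A.\Gamma'}}
      {A_{nl}.A_{nl-1}.\cdots.A_{l+1}.\Gamma \vdash_{\Sigma} (t, l) :: e \rhd_{nl} A.\Gamma'}
      \qquad
      A.\Gamma' \vdash_{\Sigma} \#1 : A}
     {A_{nl}.A_{nl-1}.\cdots.A_{l+1}.\Gamma \vdash_{\Sigma} [\![\#1, ol, nl, (t, l) :: e]\!] : A}
\]

From this we can construct the typing judgment,

\[
\dfrac{\dfrac{\dfrac{\begin{array}{c}
          \Gamma \vdash_{\Sigma} nil \rhd_{0} \Gamma \\
          \vdots
        \end{array}}
       {A_{nl-1}.\cdots.A_{l+1}.\Gamma \vdash_{\Sigma} nil \rhd_{nl-l-1} \Gamma}}
      {A_{nl}.A_{nl-1}.\cdots.A_{l+1}.\Gamma \vdash_{\Sigma} nil \rhd_{nl-l} \Gamma}
      \qquad
      \dfrac{D_1}{\Gamma \vdash_{\Sigma} t : A}}
     {A_{nl}.A_{nl-1}.\cdots.A_{l+1}.\Gamma \vdash_{\Sigma} [\![t, 0, nl - l, nil]\!] : A}
\]

As a final example, consider the rule (m3) which yields the following typing derivation.

\[
\dfrac{\dfrac{D_1}{\Gamma \vdash_{\Sigma} e_2 \rhd_{nl} \Gamma'} \qquad \Gamma' \vdash_{\Sigma} nil \rhd_{0} \Gamma'}
      {\Gamma \vdash_{\Sigma} \{\!\{nil, 0, ol_2, e_2\}\!\} \rhd_{nl} \Gamma'}
\]

And this directly contains the needed typing judgment.

\[
\dfrac{D_1}{\Gamma \vdash_{\Sigma} e_2 \rhd_{nl} \Gamma'}
\]

□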






3.6 Meta Variables and the Suspension Calculus 

The syntax of suspension expressions does not presently allow for meta variables, as described
in Section 2.4. We can remedy this by modifying the syntax of terms to

$t ::= v \mid c \mid \#i \mid (t\ t) \mid (\lambda\, t) \mid [\![t, n, n, e]\!]$,

where $v$ represents the category of meta variables. Because such variables have a logical
interpretation, any substitution for them must avoid capturing local variables. This means
that the outer context cannot affect the value of such variables. To accommodate this
interpretation, we can add the following to our reading rules:

(r7) $[\![v, ol, nl, e]\!] \rightarrow v$, if $v$ is a meta variable.

This rule has the same character as the (r1) rule for constants, and as a result, all the properties
of the suspension calculus without logical meta variables carry over to the suspension
calculus with logical meta variables.

An alternative interpretation for meta variables is one in which substitution is performed
without renaming to avoid variable capture. This is referred to as the graftable
interpretation of meta variables, and it has been found useful in higher-order unification
procedures [DHK95]. The benefit of using graftable meta variables is that dependencies of
meta variables on the outside context no longer need to be explicitly specified, which turns
out to be a significant cost in the traditional higher-order unification algorithm [Hue75]. On
the other hand, it may seem that using graftable meta variables removes any possibility of
control over dependencies, but we can in fact enforce some restrictions using explicit raising.
For example, to prevent the meta variable $X$ from depending on the de Bruijn indices $\#1$
and $\#2$, we can replace it with the term $[\![X, 0, 2, nil]\!]$. This also allows us to simulate logical meta
variables using graftable meta variables simply by lifting such graftable variables so that
they cannot depend on any of the bound variables in the term.

In order to support graftable meta variables in the calculus, we extend the syntax as
before, but instead of adding the rule (r7), we leave the rules unchanged. This is consistent
with the graftable interpretation: because we know nothing about such meta variables, we
cannot say what effect a suspension will have on them. Adding graftable meta variables to
the calculus, however, introduces some new complications with respect to confluence. For
example, consider the term $((\lambda\, ((\lambda\, X)\ t_1))\ t_2)$ in which $X$ is a graftable meta variable and
$t_1$ and $t_2$ are terms in $\rhd_{rm\beta_s}$-normal form. This term can be rewritten to

$[\![[\![X, 1, 0, (t_1, 0) :: nil]\!], 1, 0, (t_2, 0) :: nil]\!]$

and also to

$[\![[\![X, 2, 1, (\#1, 1) :: (t_2, 0) :: nil]\!], 1, 0, ([\![t_1, 1, 0, (t_2, 0) :: nil]\!], 0) :: nil]\!]$,

amongst other terms. It is easy to see that these terms cannot now be rewritten to a
common form using only the reading and ($\beta_s$) rules. The merging rules are essential to this
ability and, as we see in Section 3.8, these also suffice for this purpose. Another impact
of graftable meta variables is that normal forms with respect to the reading and merging
rules now include the possibility of remaining suspensions, whereas without graftable meta
variables such normal forms are always de Bruijn terms.

We assume henceforth that the suspension calculus includes meta variables under the 
graftable interpretation. For reasons already mentioned, it is easy to see that the properties 
we establish for the resulting calculus will hold also under the logical interpretation. 

3.7 Termination of Reading and Merging Rules 

The first significant property we prove for the suspension calculus is that the reading and
merging rules are terminating, i.e., that all $\rhd_{rm}$-sequences are finite. This is useful in all
future sections because it allows us to induct on $\rhd_{rm}$-sequences, and it tells us that $\rhd_{rm}$-normal
forms always exist. On a deeper level, the substitution process in the lambda calculus is
inherently finite and by showing that our elaboration of this substitution process is also
finite, we argue convincingly for the well behaved nature of our rules.

We prove the termination of the reading and merging rules in two steps. First, we
describe a collection of first-order terms and a well-founded order on those terms using a
lexicographic recursive path ordering [Der82, FZ95]. Second, we define a mapping from
suspension expressions to the newly described terms such that each of the reading and merging
rules produces a smaller term with respect to the defined order. The desired conclusion
follows from these facts. The key points of this work have been verified in Coq.

We imagine the terms we describe here to be an abstract view of the suspension calculus
such that only details relevant to the termination of the reading and merging rules are
considered. These terms are constructed using the following (infinite) vocabulary: the
0-ary function symbol $*$, the unary function symbol $lam$, and the binary function symbols
$app$, $cons$ and, for each positive number $i$, $s_i$. We denote this collection of terms by $\mathcal{T}$.
We assume the following partial ordering $\sqsupset$ on the signature underlying $\mathcal{T}$: $s_i \sqsupset s_j$ if
$i > j$ and, for every $i$, $s_i \sqsupset app$, $s_i \sqsupset lam$, $s_i \sqsupset cons$ and $s_i \sqsupset *$. This ordering is now
extended to the collection of terms.

Definition 3.7.1 (Term Order). The relation $\succ$ on $\mathcal{T}$ is inductively defined by the following
property: Let $s = f(s_1, \ldots, s_m)$ and $t = g(t_1, \ldots, t_n)$; both $s$ and $t$ may be $*$, i.e., the number
of arguments for either term may be 0. Then $s \succ t$ if

1. $f = g$ (in which case $n = m$), $(s_1, \ldots, s_n) \succ_{lex} (t_1, \ldots, t_n)$, and $s \succ t_i$ for all $i$ such
that $1 \leq i \leq n$, or

2. $f \sqsupset g$ and $s \succ t_i$ for all $i$ such that $1 \leq i \leq n$, or

3. $s_i = t$ or $s_i \succ t$ for some $i$ such that $1 \leq i \leq m$.

Here $\succ_{lex}$ denotes the lexicographic ordering induced by $\succ$.

In the terminology of [FZ95], $\succ$ is an instance of a recursive path ordering based on
$\sqsupset$. It is easily seen that $\sqsupset$ is a well-founded ordering on the signature underlying $\mathcal{T}$. The
results in [FZ95] then imply the following:

Lemma 3.7.1. $\succ$ is a well-founded partial order on $\mathcal{T}$.
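
The three clauses of the term order translate directly into a recursive comparison. The following Python sketch is one way to write it down; the tuple encoding of terms and the helper names (`head`, `args`, `prec_gt`, `lex_gt`, `gt`) are assumptions made for this illustration and are not part of the development above.

```python
# Terms of T as tuples (hypothetical encoding):
#   ("*",)  ("lam", a)  ("app", a, b)  ("cons", a, b)  ("s", i, a, b)  for s_i

def head(t):
    """Function symbol of t; the symbols s_i are represented as ("s", i)."""
    return ("s", t[1]) if t[0] == "s" else t[0]

def args(t):
    """Tuple of immediate subterms of t."""
    return t[2:] if t[0] == "s" else t[1:]

def prec_gt(f, g):
    """Signature ordering: s_i is above s_j when i > j, and above every other symbol."""
    if isinstance(f, tuple):
        return (not isinstance(g, tuple)) or f[1] > g[1]
    return False

def lex_gt(xs, ys):
    """Lexicographic extension of gt to argument tuples of equal length."""
    for x, y in zip(xs, ys):
        if x != y:
            return gt(x, y)
    return False

def gt(s, t):
    """True when s is greater than t in the term order."""
    # Clause 3: some argument of s equals t or is already greater than t.
    if any(si == t or gt(si, t) for si in args(s)):
        return True
    f, g = head(s), head(t)
    dominates = lambda: all(gt(s, ti) for ti in args(t))
    # Clause 2: the head of s is strictly above the head of t.
    if prec_gt(f, g):
        return dominates()
    # Clause 1: equal heads, arguments compared lexicographically.
    if f == g:
        return lex_gt(args(s), args(t)) and dominates()
    return False

STAR = ("*",)
# An instance shaped like rule (r3): the index must drop for the order to hold.
lhs = ("s", 3, STAR, ("cons", ("lam", STAR), STAR))
print(gt(lhs, ("s", 2, ("lam", STAR), STAR)))   # index decreases
print(gt(lhs, ("s", 3, ("lam", STAR), STAR)))   # index kept fixed
```

The last two comparisons differ only in the index of the right-hand symbol: when it is strictly smaller the second clause applies, while with equal indices the lexicographic comparison of the first arguments fails.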

We now consider the translation from suspension expressions to $\mathcal{T}$. The critical part
of this mapping is the treatment of expressions of the form $[\![t, ol, nl, e]\!]$ and $\{\!\{e_1, nl, ol, e_2\}\!\}$.
Because of (m1) and (m6), there is a tight relationship between the encoding of these two
types of expressions. It turns out that we can ignore the differences between them when
looking at our abstract view of the suspension calculus, thus we drop the $ol$ and $nl$ from
each and encode them as $s_i$ for some $i$.

To determine the appropriate value for $i$ in $s_i$, we must consider how this $i$ will be
needed. We will focus on the case for $[\![t, ol, nl, e]\!]$, but the same ideas will carry over to
$\{\!\{e_1, nl, ol, e_2\}\!\}$. A first attempt to translate $[\![t, ol, nl, e]\!]$ as $s_i$ for some fixed $i$ would fail
since the rule (r3) would not yield a smaller term when applied due to the lexicographic
component of our ordering. Instead, we use the value of $i$ as a coarse measure of the
remaining substitution work, so that this value decreases in the rule (r3). In order for
this to work we must count the $\#1$ on the left-hand side of (r3) for some positive amount.
This then passes the problem onto (r6) where we add a $\#1$ to the right-hand side. In
order to balance this, we assign lambdas a weight based on the number of suspensions in
which they are embedded. This results in our family of measures $\eta_j$ where $j$ represents the
number of levels of suspensions or merged environments that the current term is embedded
underneath. In defining this measure we must be aware that the rule (m1) allows the
environment $e_2$ to become embedded underneath an additional level. Thus the embedding
level of an environment must be based on the number of suspensions in the term over
which the environment applies. We call this count of suspensions the "internal embedding
potential" and denote it by $\mu$ in the following definitions. In these definitions, $max$ is the
function that picks the larger of its two integer arguments.

Definition 3.7.2. The measure $\mu$ that estimates the internal embedding potential of a
suspension expression is defined as follows:

1. For a term $t$, $\mu(t)$ is 0 if $t$ is a constant, a meta variable or a de Bruijn index, $\mu(s)$
if $t$ is $(\lambda\, s)$, $max(\mu(s_1), \mu(s_2))$ if $t$ is $(s_1\ s_2)$, and $\mu(s) + \mu(e) + 1$ if $t$ is $[\![s, ol, nl, e]\!]$.

2. For an environment $e$, $\mu(e)$ is 0 if $e$ is $nil$, $max(\mu(s), \mu(e_1))$ if $e$ is $(s, l) :: e_1$ and
$\mu(e_1) + \mu(e_2) + 1$ if $e$ is $\{\!\{e_1, nl, ol, e_2\}\!\}$.

Definition 3.7.3. The measures $\eta_i$ on terms and environments for each natural number $i$
are defined simultaneously by recursion as follows:

1. For a term $t$, $\eta_i(t)$ is 1 if $t$ is a constant, a meta variable or a de Bruijn index, $\eta_i(s) + i$
if $t$ is $(\lambda\, s)$, $max(\eta_i(s_1), \eta_i(s_2)) + 1$ if $t$ is $(s_1\ s_2)$, and $\eta_{i+1}(s) + \eta_{i+1+\mu(s)}(e) + 1$ if $t$ is
$[\![s, ol, nl, e]\!]$.

2. For an environment $e$, $\eta_i(e)$ is 0 if $e$ is $nil$, $max(\eta_i(s), \eta_i(e_1))$ if $e$ is $(s, l) :: e_1$ and
$\eta_{i+1}(e_1) + \eta_{i+1+\mu(e_1)}(e_2) + 1$ if $e$ is $\{\!\{e_1, nl, ol, e_2\}\!\}$.

Definition 3.7.4. The translation $\mathcal{E}$ of suspension expressions to $\mathcal{T}$ is defined as follows:

1. For a term $t$, $\mathcal{E}(t)$ is $*$ if $t$ is a constant, a meta variable or a de Bruijn index,
$app(\mathcal{E}(t_1), \mathcal{E}(t_2))$ if $t$ is $(t_1\ t_2)$, $lam(\mathcal{E}(t'))$ if $t$ is $(\lambda\, t')$ and $s_i(\mathcal{E}(t'), \mathcal{E}(e'))$ where $i = \eta_0(t)$
if $t$ is $[\![t', ol, nl, e']\!]$.

2. For an environment $e$, $\mathcal{E}(e)$ is $*$ if $e$ is $nil$, $cons(\mathcal{E}(t'), \mathcal{E}(e'))$ if $e$ is $(t', l) :: e'$ and
$s_i(\mathcal{E}(e_1), \mathcal{E}(e_2))$ where $i = \eta_0(e)$ if $e$ is $\{\!\{e_1, nl, ol, e_2\}\!\}$.
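
To see how the three definitions interact, here is a small Python sketch that computes the internal embedding potential, the family of weight measures, and the translation on a tuple encoding of suspension expressions. The encoding and the function names `mu`, `eta` and `translate` are assumptions made for this illustration. In particular, the sketch assumes that the abstraction clause of the weight measure adds the weight $i$, that the measures of $nil$ and of atomic terms are 0 where no value is stated, and that the rules (r3) and (r6) have the shapes used in the example at the end.

```python
# Hypothetical encoding:
#   terms: ("c",) constant, ("mv",) meta variable, ("v", i) de Bruijn index,
#          ("lam", s), ("app", s1, s2), ("susp", s, ol, nl, e)
#   environments: ("nil",), ("cons", s, l, e1), ("merge", e1, nl, ol, e2)

def mu(x):
    """Internal embedding potential."""
    tag = x[0]
    if tag in ("c", "mv", "v", "nil"):
        return 0
    if tag == "lam":
        return mu(x[1])
    if tag == "app":
        return max(mu(x[1]), mu(x[2]))
    if tag == "cons":
        return max(mu(x[1]), mu(x[3]))
    # "susp" and "merge": first component, last component, plus one.
    return mu(x[1]) + mu(x[4]) + 1

def eta(i, x):
    """Weight of x when it sits underneath i suspensions or merged environments."""
    tag = x[0]
    if tag in ("c", "mv", "v"):
        return 1
    if tag == "nil":
        return 0
    if tag == "lam":
        return eta(i, x[1]) + i          # assumed reading: abstraction adds weight i
    if tag == "app":
        return max(eta(i, x[1]), eta(i, x[2])) + 1
    if tag == "cons":
        return max(eta(i, x[1]), eta(i, x[3]))
    return eta(i + 1, x[1]) + eta(i + 1 + mu(x[1]), x[4]) + 1

def translate(x):
    """Translation into first-order terms; ("s", i, a, b) stands for s_i(a, b)."""
    tag = x[0]
    if tag in ("c", "mv", "v", "nil"):
        return ("*",)
    if tag == "lam":
        return ("lam", translate(x[1]))
    if tag == "app":
        return ("app", translate(x[1]), translate(x[2]))
    if tag == "cons":
        return ("cons", translate(x[1]), translate(x[3]))
    return ("s", eta(0, x), translate(x[1]), translate(x[4]))

NIL, C, V1 = ("nil",), ("c",), ("v", 1)
# An instance of (r3): [[#1, 1, 1, (c, 0) :: nil]] -> [[c, 0, 1, nil]]
r3_lhs = ("susp", V1, 1, 1, ("cons", C, 0, NIL))
r3_rhs = ("susp", C, 0, 1, NIL)
# An instance of (r6): [[(lam c), 0, 0, nil]] -> (lam [[c, 1, 1, (#1, 1) :: nil]])
r6_lhs = ("susp", ("lam", C), 0, 0, NIL)
r6_rhs = ("lam", ("susp", C, 1, 1, ("cons", V1, 1, NIL)))
print(eta(0, r3_lhs), eta(0, r3_rhs), eta(0, r6_lhs), eta(0, r6_rhs))
```

On these instances the weight strictly drops across (r3) and is unchanged across (r6), which is the balance between the two rules that the discussion before the definitions asks for.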

Using this translation we now lift our ordering on this collection of first-order terms to
suspension expressions.

Definition 3.7.5 (Suspension Expression Order). For suspension expressions $s$ and $t$ we
say $s \gg t$ if and only if $\mathcal{E}(s) \succ \mathcal{E}(t)$.

Three key properties of the $\succ$ relation carry over to the $\gg$ relation. First, subexpressions
are smaller than their parent expressions. Second, the relation is monotonic in the sense
that if $v$ results from $u$ by replacement of a subpart $x$ by $y$ such that $x \gg y$, then $u \gg v$.
Third, the relation is well-founded. These properties together with the following theorem
will make this relation a powerful tool for performing induction over suspension expressions.

Theorem 3.7.1. Every rewriting sequence based on the reading and merging rules terminates.

Proof. A tedious but straightforward inspection of each of the reading and merging rules
verifies the following: If $l \rightarrow r$ is an instance of these rules, then $l \gg r$, $\mu(l) \geq \mu(r)$, and,
for every natural number $i$, $\eta_i(l) \geq \eta_i(r)$. Further, it is easily seen that if $x$ and $y$ are both
either terms or environments such that $\mu(x) \geq \mu(y)$ and $\eta_i(x) \geq \eta_i(y)$ for each natural
number $i$ and if $v$ is obtained from $u$ by substituting $y$ for $x$, then $\eta_i(u) \geq \eta_i(v)$ for each
natural number $i$. From these observations it follows easily that if $t_1 \rhd_{rm} t_2$ then $t_1 \gg t_2$.
The theorem is now a consequence of Lemma 3.7.1. □



3.8 Confluence of Reading and Merging Rules 

In this section we prove that the reading and merging rules are confluent, i.e., that the
choices we make in rewriting can always be reconciled. Thus $\rhd_{rm}$-normal forms are unique,
which is another argument for the coherence of our reading and merging rules.

The property of confluence states that given terms $f$, $g$, and $h$ such that $f \rhd^{*}_{rm} g$ and
$f \rhd^{*}_{rm} h$, there exists a term $k$ such that $g \rhd^{*}_{rm} k$ and $h \rhd^{*}_{rm} k$. This can be expressed using the
diagrams described in Section 2.3.1 as,

\[
\begin{array}{ccc}
f & \xrightarrow{\;rm\,*\;} & g \\
{\scriptstyle rm\,*}\big\downarrow & & \big\downarrow{\scriptstyle rm\,*} \\
h & \xrightarrow{\;rm\,*\;} & k
\end{array}
\]

A well known result in rewriting (proved, for instance, in [Hue80]) is that a terminating
rewriting system is confluent if it is weakly confluent. Weak confluence states that given
terms $f$, $g$, and $h$ such that $f \rhd_{rm} g$ and $f \rhd_{rm} h$, there exists a term $k$ such that $g \rhd^{*}_{rm} k$ and
$h \rhd^{*}_{rm} k$, i.e., that the following figure holds.

\[
\begin{array}{ccc}
f & \xrightarrow{\;rm\;} & g \\
{\scriptstyle rm}\big\downarrow & & \big\downarrow{\scriptstyle rm\,*} \\
h & \xrightarrow{\;rm\,*\;} & k
\end{array}
\]



This is much easier to show since we only need to consider one rewrite step from $f$ to $g$ and
from $f$ to $h$. In doing this, we must consider each possible overlap between any two rules.
The most complicated of these is the overlap of (m1) with itself when applied to a term of
the form

$[\![[\![[\![t, ol_1, nl_1, e_1]\!], ol_2, nl_2, e_2]\!], ol_3, nl_3, e_3]\!]$

This leads us to develop an associativity property for merged environments which is the
content of the following section.

3.8.1 An Associativity Property for Environment Merging 

Here we show that the following two environments rewrite to a common environment.

$A = \{\!\{\{\!\{e_1, nl_1, ol_2, e_2\}\!\}, nl_2 + (nl_1 - ol_2), ol_3, e_3\}\!\}$

$B = \{\!\{e_1, nl_1, ol_2 + (ol_3 - nl_2), \{\!\{e_2, nl_2, ol_3, e_3\}\!\}\}\!\}$

Essentially this tells us that computing the effect of $e_2$ on $e_1$ and then computing the effect
of $e_3$ on the result is the same as computing the effect of $e_3$ on $e_2$ and then computing the
effect of that result on $e_1$. Ignoring the details for a moment, suppose that $e_1 = (t_1, n_1) :: e_1'$
and we are able to apply the rule (m6) to both terms (that is twice to $A$ and once to $B$).
Then the term portion of the environment term for $A$ is roughly

$[\![[\![t_1, ol_2, nl_2, e_2]\!], ol_3, nl_3, e_3]\!]$

and for $B$ it is roughly

$[\![t_1, ol_2 + (ol_3 - nl_2), nl_3 + (nl_2 - ol_3), \{\!\{e_2, nl_2, ol_3, e_3\}\!\}]\!]$

Then we can apply (m1) to bring these two terms back together. This is the heart of the
proof. The vast majority of the proof is taken up by details and corner cases. For instance,
we might not be able to apply (m6) to $\{\!\{(t_1, n_1) :: e_1', nl_1, ol_2, e_2\}\!\}$ because $nl_1 > n_1$,
$e_2$ is $nil$, or $e_2$ isn't of the form $(t_2, n_2) :: e_2'$. All of these cases must be handled and often
doubly so since we have to apply (m6) twice to the term $A$. In order to ease these pains,
we introduce a few lemmas which we can use to massage expressions into the proper form.
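
The claim can be checked mechanically on small instances. The following Python sketch normalizes terms and environments that contain no meta variables and compares the two groupings on one example. It is an illustration under stated assumptions, not the calculus itself: the encoding and the names `monus`, `norm` and `norm_env` are invented here, the reading rules (r1)-(r6) and merging rules (m1)-(m6) are assumed to have the shapes in which they are used in this chapter, subtraction in (m1) and (m6) is assumed to be truncated at zero, and inputs are assumed to be well-formed so that the old level of a suspension equals the length of its environment.

```python
# Terms: ("c", name) | ("v", i) | ("app", t1, t2) | ("lam", t) | ("susp", t, ol, nl, e)
# Environments: a list of (term, level) pairs, or ("merge", e1, nl, ol, e2).

def monus(a, b):
    """Subtraction truncated at zero (an assumption of this sketch)."""
    return max(a - b, 0)

def norm_env(e):
    """Rewrite an environment to a simple one whose terms are normalized."""
    if isinstance(e, list):
        return [(norm(t), l) for (t, l) in e]
    _, e1, nl, _ol, e2 = e
    e1, e2, out = norm_env(e1), norm_env(e2), []
    while True:
        if not e2:                          # (m2): nothing left to merge in
            return out + e1
        if not e1:
            if nl == 0:                     # (m3)
                return out + e2
            nl, e2 = nl - 1, e2[1:]         # (m4)
        elif nl > e1[0][1]:                 # (m5): prune the head of e2
            nl, e2 = nl - 1, e2[1:]
        else:                               # (m6): nl equals the level of the head of e1
            (t, n), l = e1[0], e2[0][1]
            out.append((norm(("susp", t, len(e2), l, e2)), l + monus(n, len(e2))))
            e1 = e1[1:]

def norm(t):
    """Normal form of a term with respect to the reading and merging rules."""
    tag = t[0]
    if tag in ("c", "v"):
        return t
    if tag == "app":
        return ("app", norm(t[1]), norm(t[2]))
    if tag == "lam":
        return ("lam", norm(t[1]))
    _, s, _ol, nl, e = t
    e = norm_env(e)
    ol = len(e)                             # well-formedness: ol is the length of e
    if s[0] == "c":                         # (r1)
        return s
    if s[0] == "v":
        i = s[1]
        if i > ol:                          # (r4) repeatedly, then (r2)
            return ("v", i - ol + nl)
        u, l = e[i - 1]                     # (r4) repeatedly, then (r3)
        return norm(("susp", u, 0, nl - l, []))
    if s[0] == "app":                       # (r5)
        return ("app", norm(("susp", s[1], ol, nl, e)), norm(("susp", s[2], ol, nl, e)))
    if s[0] == "lam":                       # (r6)
        return ("lam", norm(("susp", s[1], ol + 1, nl + 1, [(("v", 1), nl + 1)] + e)))
    _, s1, ol1, nl1, e1 = s                 # (m1)
    return norm(("susp", s1, ol1 + monus(ol, nl1), nl + monus(nl1, ol),
                 ("merge", e1, nl1, ol, e)))

# One instance of the two groupings, with nl1 = 1, ol2 = 2, nl2 = 1, ol3 = 1.
b, c, d = ("c", "b"), ("c", "c"), ("c", "d")
e1, e2, e3 = [(("v", 1), 1)], [(b, 1), (c, 0)], [(d, 0)]
A = ("merge", ("merge", e1, 1, 2, e2), 1 + monus(1, 2), 1, e3)
B = ("merge", e1, 1, 2 + monus(1, 1), ("merge", e2, 1, 1, e3))
print(norm_env(A) == norm_env(B))
```

Checking instances in this way does not replace the proof, but it is a convenient way to experiment with the corner cases discussed above, such as an empty second environment or a head whose level is below $nl_1$.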

The rules (m4), (m5), and (m6) require their second environment to be of the form
$(t, n) :: e$, which isn't the case with merged environments like in the term $B$. Ideally we
could use the rules (m5) and (m6) to turn a term of the form $\{\!\{e, nl, ol, e'\}\!\}$ into one of
the form $(t, n) :: e$, but if we do this we will not be able to use our inductive hypothesis.
To accommodate this issue we introduce the following lemmas which essentially state that
applying the rules (m5) and (m6) does not interfere with the process of rewriting.

Lemma 3.8.1. Let $A$ be the environment $\{\!\{e_1, nl_1, ol_1, \{\!\{e_2, nl_2, ol_3, e_3\}\!\}\}\!\}$ where $e_3$ is a simple
environment and $e_2$ is of the form $(t_2, n_2) :: e_2'$. Further, for any positive number $i$ such
that $i \leq nl_2 - n_2$ and $i \leq ol_3$, let $B$ be the environment

$\{\!\{e_1, nl_1, ol_1, \{\!\{e_2, nl_2 - i, ol_3 - i, e_3\{i\}\}\!\}\}\!\}$.

If $A \rhd^{*}_{rm} C$ for any simple environment $C$ then also $B \rhd^{*}_{rm} C$.

Proof. It suffices to verify the claim when $i = 1$; an easy induction on $i$ then extends the
result to the cases where $i > 1$. For the case of $i = 1$, the argument is by induction on the
length of the reduction sequence from $A$ to $C$ with the essential part being a consideration
of the first rule used. The details are straightforward and hence omitted. □

Lemma 3.8.2. Let $A$ be the environment $\{\!\{e_1, nl_1, ol_1, \{\!\{e_2, nl_2, ol_3, e_3\}\!\}\}\!\}$ where $e_2$ and $e_3$
are environments of the form $(t_2, nl_2) :: e_2'$ and $(t_3, n_3) :: e_3'$, respectively. Further, let $B$ be
the environment

$\{\!\{e_1, nl_1, ol_1, ([\![t_2, ol_3, n_3, e_3]\!], n_3 + (nl_2 - ol_3)) :: \{\!\{e_2', nl_2, ol_3, e_3\}\!\}\}\!\}$.

If $A \rhd^{*}_{rm} C$ for any simple environment $C$ then also $B \rhd^{*}_{rm} C$.

Proof. The proof is again by induction on the length of the reduction sequence from $A$
to $C$. The first rule in this sequence either produces $B$, in which case the lemma follows
immediately, or it can be used on $B$ (perhaps at more than one place) to produce a form
that is amenable to the application of the induction hypothesis. □

In evaluating the composition of $e_2$ and $e_3$, it may be the case that some part of $e_3$
is inconsequential. The last observation that we need is that this part can be "pruned"
immediately in calculating the composition of the combination of $e_1$ and $e_2$ with $e_3$. The
following lemma is consequential in establishing this fact.

Lemma 3.8.3. Let $A$ be the environment $\{\!\{e_1, nl_1, ol_2, e_2\}\!\}$ where $e_2$ is a simple environment.

1. If $ol_2 \leq nl_1 - lev(e_1)$ then $A$ reduces to any simple environment that $e_1$ reduces to.

2. For any positive number $i$ such that $i \leq nl_1 - lev(e_1)$ and $i \leq ol_2$, $A$ reduces to any
simple environment that $\{\!\{e_1, nl_1 - i, ol_2 - i, e_2\{i\}\}\!\}$ reduces to.

Proof. Let $e_1$ be reducible to the simple environment $e_1'$. Then we may transform $A$ to
the form $\{\!\{e_1', nl_1, ol_2, e_2\}\!\}$. Recalling that the level of an environment is never increased by
rewriting, we have that $lev(e_1') \leq lev(e_1)$. From this it follows that $A$ can be rewritten to
$e_1'$ using rules (m5) and (m2) if $ol_2 \leq nl_1 - lev(e_1)$. This establishes the first part of the
lemma.

The second part is nontrivial only if $nl_1 - lev(e_1)$ and $ol_2$ are both nonzero. Suppose
this to be the case and let $B$ be $\{\!\{e_1, nl_1 - 1, ol_2 - 1, e_2\{1\}\}\!\}$. The desired result follows by
an induction on $i$ if we can show that $A$ can be rewritten to any simple environment that
$B$ reduces to. We do this by an induction on the length of the reduction sequence from $B$
to the simple environment. This sequence must evidently be of length at least one. If a
proper subpart of $B$ is rewritten by the first rule in this sequence, then the same rule can be
applied to $A$ as well and the induction hypothesis easily yields the desired conclusion. If $B$
is rewritten by one of the rules (m3)-(m6), then it must be the case that $A \rhd_{rm} B$ via either
rule (m4) or (m5) from which the claim follows immediately. Finally, if $B$ is rewritten using
rule (m2), then $ol_2 \leq nl_1 - lev(e_1)$. The second part of the lemma is now a consequence of
the first part. □

We now prove the associativity property for environment composition:

Lemma 3.8.4. Let $A$ and $B$ be environments of the form

$\{\!\{\{\!\{e_1, nl_1, ol_2, e_2\}\!\}, nl_2 + (nl_1 - ol_2), ol_3, e_3\}\!\}$

and

$\{\!\{e_1, nl_1, ol_2 + (ol_3 - nl_2), \{\!\{e_2, nl_2, ol_3, e_3\}\!\}\}\!\}$

respectively. Then there is a simple environment $C$ such that $A \rhd^{*}_{rm} C$ and $B \rhd^{*}_{rm} C$.

Proof. We assume that $e_1$, $e_2$ and $e_3$ are simple environments; if this is not the case at the
outset, then we may rewrite them to such a form in both $A$ and $B$ before commencing the
proof we provide. Our argument is now based on an induction on the structure of $e_3$ with
possibly further inductions on the structures of $e_2$ and $e_1$.

Base case for first induction. When $e_3$ is $nil$, the lemma is seen to be true by observing
that both $A$ and $B$ rewrite to $\{\!\{e_1, nl_1, ol_2, e_2\}\!\}$ by virtue of rule (m2).

Inductive step for first induction. Let $e_3 = (t_3, n_3) :: e_3'$. We now proceed by an induction
on the structure of $e_2$.

Base case for second induction. When $e_2$ is $nil$, it can be seen that, by virtue of rules
(m2), (m3) and either (m4) or (m5), $A$ and $B$ reduce to $\{\!\{e_1, nl_1, ol_3 - nl_2, e_3\{nl_2\}\}\!\}$ when
$ol_3 > nl_2$ and to $e_1$ otherwise. The truth of the lemma follows immediately from this.

Inductive step for second induction. Let $e_2 = (t_2, n_2) :: e_2'$. We consider first the situation
where $nl_1 > lev(e_1)$. Suppose further that $ol_3 \leq (nl_2 - n_2)$. Using rules (m5) and (m2), we
see then that

$B \rhd^{*}_{rm} \{\!\{e_1, nl_1, ol_2, e_2\}\!\}$.

We also note that $ol_3 \leq (nl_2 + (nl_1 - ol_2)) - lev(\{\!\{e_1, nl_1, ol_2, e_2\}\!\})$ in this case. Lemma 3.8.3
assures us now that $A$ can be rewritten to any simple environment that $\{\!\{e_1, nl_1, ol_2, e_2\}\!\}$
reduces to and thereby verifies the lemma in this case.

It is possible, of course, that $ol_3 > (nl_2 - n_2)$. Here we see that

$B \rhd^{*}_{rm} \{\!\{e_1, nl_1 - 1, ol_2 + (ol_3 - nl_2) - 1, \{\!\{e_2', n_2, ol_3 - (nl_2 - n_2), e_3\{nl_2 - n_2\}\}\!\}\}\!\}$

using rules (m5) and (m6). Using rule (m5), we also have that

$A \rhd_{rm} \{\!\{\{\!\{e_1, nl_1 - 1, ol_2 - 1, e_2'\}\!\}, nl_2 + (nl_1 - ol_2), ol_3, e_3\}\!\}$.

Invoking the induction hypothesis, it follows that $A$ and

$\{\!\{e_1, nl_1 - 1, ol_2 + (ol_3 - nl_2) - 1, \{\!\{e_2', nl_2, ol_3, e_3\}\!\}\}\!\}$

reduce to a common simple environment. By Lemma 3.8.1 it follows that $B$ must also
reduce to this environment.

The only remaining situation to consider, then, is that when $nl_1 = lev(e_1)$. For this
case we need the last induction, that on the structure of $e_1$.

Base case for final induction. If $e_1$ is $nil$, then $nl_1$ must be 0. It follows easily that both $A$
and $B$ reduce to $\{\!\{e_2, nl_2, ol_3, e_3\}\!\}$ and that the lemma must therefore be true.

Inductive step for final induction. Here $e_1$ must be of the form $(t_1, nl_1) :: e_1'$. We dispense
first with the situation where $n_2 < nl_2$. In this case, by rule (m5)

$B \rhd^{*}_{rm} \{\!\{e_1, nl_1, ol_2 + (ol_3 - nl_2), \{\!\{e_2, nl_2 - 1, ol_3 - 1, e_3'\}\!\}\}\!\}$.

By the induction hypothesis used relative to $e_3'$, $B$ and the expression

$\{\!\{\{\!\{e_1, nl_1, ol_2, e_2\}\!\}, nl_2 + (nl_1 - ol_2) - 1, ol_3 - 1, e_3'\}\!\}$

must reduce to a common simple environment. By Lemma 3.8.3, $A$ must also reduce to
this environment.

Thus, it only remains for us to consider the situation in which $n_2 = nl_2$. In this case by
using rule (m6) twice we may transform $A$ to the expression $A_h :: A_t$ where

$A_h = ([\![[\![t_1, ol_2, n_2, e_2]\!], ol_3, n_3, e_3]\!], n_3 + ((nl_2 + (nl_1 - ol_2)) - ol_3))$

and

$A_t = \{\!\{\{\!\{e_1', nl_1, ol_2, e_2\}\!\}, nl_2 + (nl_1 - ol_2), ol_3, e_3\}\!\}$.

Similarly, $B$ may be rewritten to the expression $B_h :: B_t$ where

$B_h = ([\![t_1, ol_2 + (ol_3 - nl_2), n_3 + (nl_2 - ol_3),$
$\qquad ([\![t_2, ol_3, n_3, e_3]\!], n_3 + (nl_2 - ol_3)) :: \{\!\{e_2', nl_2, ol_3, e_3\}\!\}]\!],$
$\qquad n_3 + (nl_2 - ol_3) + (nl_1 - (ol_2 + (ol_3 - nl_2))))$

and

$B_t = \{\!\{e_1', nl_1, ol_2 + (ol_3 - nl_2), ([\![t_2, ol_3, n_3, e_3]\!], n_3 + (nl_2 - ol_3)) :: \{\!\{e_2', nl_2, ol_3, e_3\}\!\}\}\!\}$.

Now, using straightforward arithmetic identities, it can be seen that the "index" components
of $A_h$ and $B_h$ are equal. Further, the term component of $A_h$ can be rewritten to a form
identical to the term component of $B_h$ by using the rules (m1) and (m6). Finally, by virtue
of the induction hypothesis, it follows that $A_t$ and the expression

$\{\!\{e_1', nl_1, ol_2 + (ol_3 - nl_2), \{\!\{e_2, nl_2, ol_3, e_3\}\!\}\}\!\}$

reduce to a common simple environment. Lemma 3.8.2 allows us to conclude that $B_t$ can
also be rewritten to this expression. Putting all these observations together it is seen that
$A$ and $B$ can be reduced to a common simple environment in this case as well. □

3.8.2 Proof of Confluence for Reading and Merging Rules

We are now in a position to prove the confluence of the reading and merging rules, using 
the ideas outlined in the beginning of this section. 

Lemma 3.8.5. The relation $\rhd_{rm}$ is weakly confluent.

Proof. We recall the method of proof from [Hue80]. An expression $t$ constitutes a nontrivial
overlap of the rules $R_1$ and $R_2$ at a subexpression $s$ if (a) $t$ is an instance of the left-hand
side of $R_1$, (b) $s$ is an instance of the left-hand side of $R_2$ and also does not occur within
the instantiation of a variable on the left-hand side of $R_1$ when this is matched with $t$, and
(c) either $s$ is distinct from $t$ or $R_1$ is distinct from $R_2$. Let $r_1$ be the expression that
results from rewriting $t$ using $R_1$ and let $r_2$ result from $t$ by rewriting $s$ using $R_2$. Then the
pair $(r_1, r_2)$ is called the conflict pair corresponding to the overlap in question. Relative to
these notions, the lemma can be proved by establishing the following simpler property:
for every conflict pair corresponding to the reading and merging rules, it is the case that
the two terms can be rewritten to a common form using these rules.

In completing this line of argument, the nontrivial overlaps that we have to consider are
those between (m1) and each of the rules (r1)-(r6), between (m1) and itself, and between
(m2) and (m3). The last of these cases is easily dealt with: the two expressions constituting
the conflict pair are identical, both being $nil$. The overlap between (m1) and itself occurs
over a term of the form $[\![[\![[\![t, ol_1, nl_1, e_1]\!], ol_2, nl_2, e_2]\!], ol_3, nl_3, e_3]\!]$. By using rule (m1) once
more on each of the terms in the conflict pair, these can be rewritten to expressions of the
form $[\![t, ol', nl', e']\!]$ and $[\![t, ol'', nl'', e'']\!]$, respectively, whence we can see that $ol' = ol''$ and
$nl' = nl''$ by simple arithmetic reasoning and that $e'$ and $e''$ reduce to a common form using
Lemma 3.8.4. The overlaps between (m1) and the reading rules are also easily dealt with.
For instance, consider the case of (m1) and (r2) where we have

$[\![[\![\#i, 0, nl_1, nil]\!], ol_2, nl_2, e_2]\!].$

Rewriting with (r2) first produces

$[\![\#(i + nl_1), ol_2, nl_2, e_2]\!]$

while rewriting with (m1) first yields

$[\![\#i, ol_2 - nl_1, nl_2 + (nl_1 - ol_2), \{\!\{nil, nl_1, ol_2, e_2\}\!\}]\!].$

For both expressions we can rewrite $e_2$ to a simple environment $e_2'$ using Lemma 3.3.1. Now
if $ol_2 > nl_1$ then both terms can be reconciled to

$[\![\#i, ol_2 - nl_1, nl_2, e_2'\{nl_1\}]\!].$

In the case of $ol_2 < nl_1$, both terms rewrite to $\#(i + nl_1 - ol_2)$. Thus the conflict pair is
resolved. The other cases of overlaps between (m1) and the reading rules are similar and
require roughly the same reasoning. □
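
The "simple arithmetic reasoning" behind $ol' = ol''$ and $nl' = nl''$ can be checked mechanically. The following is a minimal sketch, not part of the calculus itself: it assumes that the subtractions in the index expressions of rule (m1) are truncated at zero, and the helper names `monus`, `merge_indices` and `both_orders_agree` are hypothetical. It compares the two orders in which (m1) can be applied to a doubly nested suspension.

```python
from itertools import product


def monus(a: int, b: int) -> int:
    """Truncated subtraction on natural numbers (assumed reading of '-')."""
    return max(a - b, 0)


def merge_indices(inner, outer):
    """Index part of rule (m1): combine (ol1, nl1) with (ol2, nl2)."""
    ol1, nl1 = inner
    ol2, nl2 = outer
    return (ol1 + monus(ol2, nl1), nl2 + monus(nl1, ol2))


def both_orders_agree(bound: int) -> bool:
    """Check all index values below `bound` for the (m1)/(m1) overlap."""
    for ol1, nl1, ol2, nl2, ol3, nl3 in product(range(bound), repeat=6):
        a, b, c = (ol1, nl1), (ol2, nl2), (ol3, nl3)
        inner_first = merge_indices(merge_indices(a, b), c)
        outer_first = merge_indices(a, merge_indices(b, c))
        if inner_first != outer_first:
            return False
    return True
```

A bounded search of this kind is only a sanity check; the proof relies on the identity holding for all natural numbers.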

As observed already, the main result of this section follows directly from Lemma 3.8.5 
and Theorem 3.7.1. 

Theorem 3.8.1. The relation $\rhd_{rm}$ is confluent.

The uniqueness of $\rhd_{rm}$-normal forms is an immediate consequence of Theorem 3.8.1. In
the sequel, a notation for referring to such forms will be useful.

Definition 3.8.1 (Reading and merging normal form). The notation $|t|$ denotes the
$\rhd_{rm}$-normal form of a suspension expression $t$.

It is easily seen that the $\rhd_{rm}$-normal form for a term that does not contain meta variables
is a term that is devoid of suspensions, i.e., a de Bruijn term. A further observation is
that if all the environments appearing in the original term are simple, then just the
reading rules suffice in reducing it to the de Bruijn term that is its unique $\rhd_{rm}$-normal form.

3.9 Simulation of Beta Reduction 

A fundamental property of all explicit substitution calculi is that they properly simulate the
lambda calculus. In particular, we must ensure that our $\beta_s$ rule corresponds to the $\beta$ rule
of the lambda calculus, modulo the reading and merging rules. The following two theorems
establish this result. The first shows that a $\beta_s$ rewrite on suspension terms corresponds to
some number of $\beta$ rewrites in the lambda calculus. One might think of this theorem as
proving the soundness of our calculus. The second theorem shows that any $\beta$ rewrite can
be simulated using the rules of our calculus. One might think of this theorem as proving
the completeness of our calculus.

Theorem 3.9.1. If $x_1$ and $x_2$ are suspension expressions such that $x_1 \rhd_{\beta_s} x_2$ then $|x_1| \rhd^*_{\beta} |x_2|$.

Proof. This result is proven for the original suspension calculus in [NW98]. We know from
Theorem 3.4.1 that the $\rhd_{rm}$-normal forms of $x_1$ and $x_2$ are the same as in the original suspension
calculus, thus we can carry over the previous result. □

Theorem 3.9.2. If $x_1$ and $x_2$ are suspension expressions in $\rhd_{rm}$-normal form such that
$x_1 \rhd_{\beta} x_2$ then $x_1 \rhd^*_{rm\beta_s} x_2$.

Proof. A stronger version of this property, where the result is replaced with $x_1 \rhd^*_{r\beta_s} x_2$, is
proved as Lemma 8.2 of the original suspension paper. Since the $\beta_s$ rule and the reading rules
are essentially the same for the two calculi, the result carries over. □

3.10 Confluence of Overall Calculus

In this section we prove that the full suspension calculus is confluent even in the presence
of graftable meta variables. One important distinction between the proof of this property
and the proof of confluence for the reading and merging rules is that the latter is proven
in spite of the merging rules while the former will be proven only because of the merging
rules (see Section 3.6 for a discussion of why the merging rules are required). Thus this
property speaks to the well-designed nature of the merging rules. Moreover, this property
makes the suspension calculus a candidate for unification procedures designed specifically for
graftable meta variables [DHK95], because with this confluence property we are guaranteed
that $\rhd_{rm\beta_s}$-normal forms are unique even with graftable meta variables.

For most explicit substitution calculi, confluence of the full calculus is proven using
Hardin's Interpretation Method [Har89]. The interpretation method starts with the lambda
calculus, which is already confluent, and it uses the confluence and strong termination of the
substitution rules (in our case, the reading and merging rules) to close a confluence diagram
for the overall calculus. This method is inadequate, however, when we allow for graftable
meta variables. The problem is that the lambda calculus with graftable meta variables does
not make sense and is not confluent. Instead, we follow the method presented in [CHL96],
which is based on the following key lemma.

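Lemma 3.10.1. Let $\mathcal{R}$ and $\mathcal{S}$ be two relations defined on the same set $X$, $\mathcal{R}$ being confluent
and strongly normalizing, and $\mathcal{S}$ being strongly confluent, i.e. such that the following
diagrams hold for any $f, g, h \in X$:

[Diagrams: (1) if $f\ \mathcal{S}\ g$ and $f\ \mathcal{S}\ h$ then there is a $k$ with $g\ \mathcal{S}\ k$ and $h\ \mathcal{S}\ k$; (2) if $f\ \mathcal{S}\ g$ and $f\ \mathcal{R}\ h$ then there is a $k$ with $g\ \mathcal{R}^*\ k$ and $h\ \mathcal{R}^*\mathcal{S}\mathcal{R}^*\ k$.]

Then the relation $\mathcal{R}^*\mathcal{S}\mathcal{R}^*$ is confluent.

$$t \to t \qquad e \to e$$

$$\frac{t_1 \to t_1' \quad t_2 \to t_2'}{t_1\ t_2 \to t_1'\ t_2'} \qquad \frac{t \to t' \quad e \to e'}{(t, l) :: e \to (t', l) :: e'}$$

$$\frac{t \to t'}{\lambda\, t \to \lambda\, t'} \qquad \frac{e_1 \to e_1' \quad e_2 \to e_2'}{\{\!\{e_1, nl_1, ol_2, e_2\}\!\} \to \{\!\{e_1', nl_1, ol_2, e_2'\}\!\}}$$

$$\frac{t \to t' \quad e \to e'}{[\![t, ol, nl, e]\!] \to [\![t', ol, nl, e']\!]} \qquad \frac{t \to t'}{(t, n) \to (t', n)}$$

$$\frac{t_1 \to t_1' \quad t_2 \to t_2'}{(\lambda\, t_1)\ t_2 \to [\![t_1', 1, 0, (t_2', 0) :: nil]\!]}$$

Figure 3.4: Parallel $\beta_s$-reduction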

We will apply the lemma using the reading and merging rules as $\mathcal{R}$ and parallel
$\beta_s$-reduction as $\mathcal{S}$.

Definition 3.10.1 (Parallel $\beta_s$-reduction). Parallel $\beta_s$-reduction is defined by the rules in
Figure 3.4 and is denoted by $\rhd_{\beta_s\parallel}$.



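Lemma 3.10.2. $\rhd_{rm}$ and $\rhd_{\beta_s\parallel}$ satisfy the conditions of Lemma 3.10.1.

Proof. $\rhd_{\beta_s\parallel}$ is obviously strongly confluent since $\rhd_{\beta_s}$ is a left linear system with no critical
pairs. This proves that the first diagram in Lemma 3.10.1 holds.

For the second diagram, the interesting case is the critical pair for $f = [\![(\lambda\, t_1)\ t_2, ol, nl, e]\!]$.
In this case, we have $g = [\![[\![t_1', 1, 0, (t_2', 0) :: nil]\!], ol, nl, e']\!]$ and $h = [\![\lambda\, t_1, ol, nl, e]\!]\ [\![t_2, ol, nl, e]\!]$,
where $t_1 \rhd_{\beta_s\parallel} t_1'$, $t_2 \rhd_{\beta_s\parallel} t_2'$, and $e \rhd_{\beta_s\parallel} e'$. We must find a $k$ such that $g \rhd^*_{rm} k$ and
$h \rhd^*_{rm} h' \rhd_{\beta_s\parallel} h'' \rhd^*_{rm} k$. This is straightforward since

$g = [\![[\![t_1', 1, 0, (t_2', 0) :: nil]\!], ol, nl, e']\!]$
$\rhd_{rm} [\![t_1', ol + 1, nl, \{\!\{(t_2', 0) :: nil, 0, ol, e'\}\!\}]\!]$
$\rhd^*_{rm} [\![t_1', ol + 1, nl, ([\![t_2', ol, nl, e']\!], nl) :: e']\!]$

and

$h = [\![\lambda\, t_1, ol, nl, e]\!]\ [\![t_2, ol, nl, e]\!]$
$\rhd_{rm} (\lambda\, [\![t_1, ol + 1, nl + 1, (\#1, nl + 1) :: e]\!])\ [\![t_2, ol, nl, e]\!]$
$\rhd_{\beta_s\parallel} [\![[\![t_1', ol + 1, nl + 1, (\#1, nl + 1) :: e']\!], 1, 0, ([\![t_2', ol, nl, e']\!], 0) :: nil]\!]$
$\rhd_{rm} [\![t_1', ol + 1, nl, \{\!\{(\#1, nl + 1) :: e', nl + 1, 1, ([\![t_2', ol, nl, e']\!], 0) :: nil\}\!\}]\!]$
$\rhd^*_{rm} [\![t_1', ol + 1, nl, ([\![t_2', ol, nl, e']\!], nl) :: e']\!]$

□

Theorem 3.10.1. The relation $\rhd_{rm\beta_s}$ is confluent.

Proof. Note that $\rhd_{rm\beta_s} \subseteq \mathcal{R}^*\mathcal{S}\mathcal{R}^* \subseteq \rhd^*_{rm\beta_s}$. □
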
3.11 Similarity in the Suspension Calculus 

The purpose of this section is to introduce a notion of similarity in the suspension calculus 
which relates suspension expressions that differ only in the renumbering and indices of en- 
vironment terms. This allows us to formally capture the notion that two environments are 
similar enough that they act the same during rewriting which will be useful when we trans- 
late from another explicit substitution calculus into the suspension calculus (Section 4.2.2). 
The notion of similarity stems from the fact that there are two ways to represent the renum- 
bering to be done on an environment term. One is using the difference between the new 
embedding level of the suspension and the embedding level of the environment term. The 
other is with an explicit renumbering substitution applied to the term in the environment 
term. This section proves that these two notions are equivalent for the purpose of finding 
normal forms. 

Definition 3.11.1. The similarity relation ~ is defined in Figure 3.5. 
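
$$t \sim t \qquad e \sim e$$

$$\frac{t_1 \sim t_1' \quad t_2 \sim t_2'}{t_1\ t_2 \sim t_1'\ t_2'} \qquad \frac{t \sim t' \quad e \sim e'}{(t, n) :: e \sim (t', n) :: e'}$$

$$\frac{t \sim t'}{\lambda\, t \sim \lambda\, t'} \qquad \frac{e_1 \sim e_1' \quad e_2 \sim e_2'}{\{\!\{e_1, nl_1, ol_2, e_2\}\!\} \sim \{\!\{e_1', nl_1, ol_2, e_2'\}\!\}}$$

$$\frac{t \sim t' \quad e \sim e'}{[\![t, ol, nl, e]\!] \sim [\![t', ol, nl, e']\!]} \qquad \frac{t \sim t'}{(t, n) \sim (t', n)}$$

$$\frac{t \sim t' \quad r \sim r' \quad e \sim e'}{([\![t, ol, nl, r]\!], nl + k) :: e \sim ([\![t', ol, nl', r']\!], nl' + k) :: e'}$$

Figure 3.5: The similarity relation.
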



The main result of this section is to prove that similar terms rewrite to the same normal 
form. This first requires proving the following lemma. 

Lemma 3.11.1. Let $\{\!\{e_1, nl_1, ol_2, e_2\}\!\} \rhd^*_{rm} e$ where $e$ is a simple environment.

• If $nl_1 - lev(e_1) > k$ and $ol_2 > k$, then $\{\!\{e_1, nl_1 - k, ol_2 - k, e_2\{k\}\}\!\} \rhd^*_{rm} e$.

• If $nl_1 - lev(e_1) > ol_2$, then $e_1 \rhd^*_{rm} e$.

Proof. The proof is by induction on the length of the sequence $\{\!\{e_1, nl_1, ol_2, e_2\}\!\} \rhd^*_{rm} e$. □

Theorem 3.11.1. If $t \sim t'$ for terms $t$ and $t'$ then they rewrite by reading and merging
rules to the same de Bruijn term. If $e \sim e'$ for environments $e$ and $e'$ then they rewrite by
reading and merging rules to similar simple environments.

Proof. We prove the general case of $exp \sim exp'$ for suspension expressions $exp$ and $exp'$.
We do this by induction using the relation defined in Definition 3.7.5. Note that this
relation decreases when an expression is rewritten using the reading and merging rules, and
also that a subpart is always smaller than the original expression.

Because we are inducting on this relation, we can assume that the result already holds for any
similar subparts of $exp$ and $exp'$. Then we can rewrite these similar subparts to be equal in
the case of terms or simple environments in the case of environments. This decreases the
measure of the overall terms $exp$ and $exp'$, and thus the inductive hypothesis then applies
to them. By this reasoning, we can assume that whenever two subparts of $exp$ and $exp'$ are
similar they are in fact equal in the case of terms and simple in the case of environments.
We will also use the convention that $x$ and $x'$ are always similar.

Let us consider cases based on the structure of $exp$ and $exp'$, first looking at the case
when both are terms. If they are both constants or de Bruijn indices then the result is
trivial. If $exp = (t_1\ t_2)$ then $exp' = (t_1'\ t_2')$ and $t_1 = t_1'$ and $t_2 = t_2'$, so the result follows
trivially. A similar result holds in the case where $exp$ and $exp'$ are lambda abstractions.

The first nontrivial case is when $exp$ and $exp'$ are both suspensions, say $exp = [\![t, ol, nl, e]\!]$
and $exp' = [\![t, ol, nl, e']\!]$. Now consider which rewrite rules apply to the top level of these
terms, keeping in mind that $t$ is in normal form. If $t$ is an application or an abstraction then
(r5) or (r6) applies and the result follows from the inductive hypothesis. If $t$ is a de Bruijn
index and (r2) or (r4) applies then the result again follows from the inductive hypothesis.
If (r3) applies and $e$ and $e'$ have the same head then the result is trivial. The key case is
when (r3) applies and $e$ and $e'$ have different heads, in which case we have

$exp = [\![\#1, ol, nl, ([\![t_r, ol_r, nl_r, r]\!], nl_r + k) :: e_1]\!]$
$\rhd_{(r3)} [\![[\![t_r, ol_r, nl_r, r]\!], 0, nl - (nl_r + k), nil]\!]$
$\rhd_{(m1)} [\![t_r, ol_r, nl - (nl_r + k) + nl_r, \{\!\{r, nl_r, 0, nil\}\!\}]\!]$
$\rhd_{(m2)} [\![t_r, ol_r, nl - k, r]\!]$

$exp' = [\![\#1, ol, nl, ([\![t_r, ol_r, nl_r', r']\!], nl_r' + k) :: e_1']\!]$
$\rhd_{(r3)} [\![[\![t_r, ol_r, nl_r', r']\!], 0, nl - (nl_r' + k), nil]\!]$
$\rhd_{(m1)} [\![t_r, ol_r, nl - (nl_r' + k) + nl_r', \{\!\{r', nl_r', 0, nil\}\!\}]\!]$
$\rhd_{(m2)} [\![t_r, ol_r, nl - k, r']\!]$

The two resulting suspensions are similar and smaller than their original terms, thus the
inductive hypothesis finishes this case.

The other half of the proof is to show that when $exp$ and $exp'$ are similar environments
then they rewrite to similar simple environments. The cases when $exp$ and $exp'$ are either
$nil$ or a cons follow trivially from the inductive hypothesis. The important case is when
$exp = \{\!\{e_1, nl_1, ol_2, e_2\}\!\}$ and $exp' = \{\!\{e_1', nl_1, ol_2, e_2'\}\!\}$. Consider the cases for which rewrites
can apply to the top level of both expressions. If (m2), (m3), or (m4) applies to the first
expression then the same rewrite applies to the second environment and the result follows
easily. The case when (m5) applies to both is also direct using the inductive hypothesis.
The two remaining cases are the most interesting: when (m6) applies to both, and when
(m5) applies to one and (m6) to the other.

Consider when (m6) applies to both expressions. Here the head terms of $e_1$ and $e_1'$ must
be the same. If the head terms of $e_2$ and $e_2'$ are also the same then the result is trivial.
Otherwise we have

$exp = \{\!\{(t_1, nl_1) :: e_3, nl_1, ol_2, ([\![t_r, ol_r, nl_r, r]\!], nl_r + k) :: e_4\}\!\}$
$\rhd_{(m6)} ([\![t_1, ol_2, nl_r + k, ([\![t_r, ol_r, nl_r, r]\!], nl_r + k) :: e_4]\!], nl_r + k + (nl_1 - ol_2))$
$\quad :: \{\!\{e_3, nl_1, ol_2, ([\![t_r, ol_r, nl_r, r]\!], nl_r + k) :: e_4\}\!\}$

$exp' = \{\!\{(t_1, nl_1) :: e_3', nl_1, ol_2, ([\![t_r, ol_r, nl_r', r']\!], nl_r' + k) :: e_4'\}\!\}$
$\rhd_{(m6)} ([\![t_1, ol_2, nl_r' + k, ([\![t_r, ol_r, nl_r', r']\!], nl_r' + k) :: e_4']\!], nl_r' + k + (nl_1 - ol_2))$
$\quad :: \{\!\{e_3', nl_1, ol_2, ([\![t_r, ol_r, nl_r', r']\!], nl_r' + k) :: e_4'\}\!\}$

These two environments are still similar and the inductive hypothesis now applies.

The final case is when (m5) applies to one expression and (m6) to the other. Without
loss of generality, assume that (m6) applies to $exp$ and (m5) to $exp'$. There are two subcases
based on whether the heads of $e_2$ and $e_2'$ are the same or not. Here we will only consider
the hardest case, where the heads differ. The other case is a simplification of the following
argument.

$exp = \{\!\{([\![t_s, ol_s, nl_s, s]\!], nl_s + k_s) :: e_3, nl_1, ol_2, ([\![t_r, ol_r, nl_r, r]\!], nl_r + k_r) :: e_4\}\!\}$
$\rhd_{(m6)} ([\![[\![t_s, ol_s, nl_s, s]\!], ol_2, nl_r + k_r, e_2]\!], nl_r + k_r + (nl_1 - ol_2)) :: \{\!\{e_3, nl_1, ol_2, e_2\}\!\}$
$\rhd_{(m1)} ([\![t_s, ol_s + (ol_2 - nl_s), nl_r + k_r + (nl_s - ol_2), \{\!\{s, nl_s, ol_2, e_2\}\!\}]\!],$
$\quad nl_r + k_r + (nl_1 - ol_2)) :: \{\!\{e_3, nl_1, ol_2, e_2\}\!\}$

$exp' = \{\!\{([\![t_s', ol_s, nl_s', s']\!], nl_s' + k_s) :: e_3', nl_1, ol_2, e_2'\}\!\}$

Consider first the case when $nl_1 - (nl_s' + k_s) > ol_2$. Since (m6) applies to the first expression
we know that $nl_1 = nl_s + k_s$ and thus $nl_s - nl_s' > ol_2$. The first expression then rewrites to

$([\![t_s, ol_s, nl_r + k_r + nl_s - ol_2, \{\!\{s, nl_s, ol_2, e_2\}\!\}]\!], nl_r + k_r + nl_1 - ol_2) :: \{\!\{e_3, nl_1, ol_2, e_2\}\!\}$

The second expression rewrites using (m5) multiple times to

$([\![t_s', ol_s, nl_s', s']\!], nl_s' + k_s) :: e_3'$

It is easily seen that $\{\!\{s, nl_s, ol_2, e_2\}\!\}$ is smaller than $exp$ and $\{\!\{s', nl_s, ol_2, e_2'\}\!\}$ is smaller than
$exp'$ with respect to the relation used for the induction. Moreover,
$\{\!\{s, nl_s, ol_2, e_2\}\!\}$ and $\{\!\{s', nl_s, ol_2, e_2'\}\!\}$ are similar, so the inductive hypothesis applies and tells
us these merged environments rewrite to similar simple environments. Since $nl_s - ol_2 >
nl_s' > lev(s')$, applying Lemma 3.11.1 yields that $\{\!\{s, nl_s, ol_2, e_2\}\!\}$ and $s'$ also rewrite to
similar simple environments. By applying these rewrites, the heads of our two environments
are now similar. By the inductive hypothesis we also know that $\{\!\{e_3, nl_1, ol_2, e_2\}\!\}$
and $\{\!\{e_3', nl_1, ol_2, e_2'\}\!\}$ rewrite to similar simple environments. Since $nl_1 - lev(e_3') > ol_2$ we
can apply Lemma 3.11.1 to know that $\{\!\{e_3, nl_1, ol_2, e_2\}\!\}$ and $e_3'$ rewrite to similar simple
environments, thus finishing this case.

The other case is when $nl_1 - (nl_s' + k_s) < ol_2$. Then we can apply (m5) multiple times
to $exp'$ and then eventually (m6),

$exp' = \{\!\{([\![t_s', ol_s, nl_s', s']\!], nl_s' + k_s) :: e_3', nl_1, ol_2, e_2'\}\!\}$
$\rhd^*_{(m5)} \{\!\{([\![t_s', ol_s, nl_s', s']\!], nl_s' + k_s) :: e_3', nl_s' + k_s, ol_2 - nl_s + nl_s', e_2'\{nl_s - nl_s'\}\}\!\}$
$\rhd_{(m6)} ([\![[\![t_s', ol_s, nl_s', s']\!], ol_2 - nl_s + nl_s', l, e_2'\{nl_s - nl_s'\}]\!], l + (k_s - (ol_2 - nl_s)))$
$\quad :: \{\!\{e_3', nl_s' + k_s, ol_2 - nl_s + nl_s', e_2'\{nl_s - nl_s'\}\}\!\}$

where $l = lev(e_2'\{nl_s - nl_s'\})$. Let us focus first on the tail portions of our two
environments. By the inductive hypothesis $\{\!\{e_3, nl_1, ol_2, e_2\}\!\}$ and $\{\!\{e_3', nl_1, ol_2, e_2'\}\!\}$ rewrite
to similar simple environments. Then by applying Lemma 3.11.1, $\{\!\{e_3, nl_1, ol_2, e_2\}\!\}$ and
$\{\!\{e_3', nl_s' + k_s, ol_2 - nl_s + nl_s', e_2'\{nl_s - nl_s'\}\}\!\}$ rewrite to similar simple environments.

Finally we focus on the heads of our environments. The head for $exp'$ can now be
rewritten using (m1),

$[\![[\![t_s', ol_s, nl_s', s']\!], ol_2 - nl_s + nl_s', l, e_2'\{nl_s - nl_s'\}]\!]$
$\rhd_{(m1)} [\![t_s', ol_s + ol_2 - nl_s, l, \{\!\{s', nl_s', ol_2 - nl_s + nl_s', e_2'\{nl_s - nl_s'\}\}\!\}]\!]$

Note that, as before, $\{\!\{s, nl_s, ol_2, e_2\}\!\}$ is smaller than $exp$ and $\{\!\{s', nl_s, ol_2, e_2'\}\!\}$ is smaller than $exp'$;
also $\{\!\{s, nl_s, ol_2, e_2\}\!\}$ and $\{\!\{s', nl_s, ol_2, e_2'\}\!\}$ are similar, so by the inductive hypothesis these
merged environments rewrite to similar simple environments. Since $nl_s - lev(s') > nl_s - nl_s'$ and
$ol_2 > nl_s - nl_s'$, applying Lemma 3.11.1 yields that $\{\!\{s, nl_s, ol_2, e_2\}\!\}$ and
$\{\!\{s', nl_s', ol_2 - nl_s + nl_s', e_2'\{nl_s - nl_s'\}\}\!\}$ rewrite to similar simple environments. Using this
rewriting results in the heads being similar. □



Chapter 4

Comparison with Other Explicit Substitution Calculi

There are many explicit substitution calculi that offer alternative treatments of substitutions,
which leads to varying computational properties. In this chapter we outline three
computational properties that are of particular interest and use these to categorize various
calculi based on how well they capture these. The three properties we focus on are (1) combination
of substitution walks, (2) confluence in the presence of graftable meta variables,
and (3) preservation of strong normalization.

Combination of substitution walks, also called merging, can be traced back to de Bruijn
[dB72]. The substitution operation on de Bruijn terms, see Definition 2.2.2, is denoted by
$S(t; s_1, s_2, \ldots)$ and represents the term $t$ where $s_i$ is substituted for the $i$th de Bruijn
index. De Bruijn establishes the meta-property $S(S(t; s_1, s_2, \ldots); r_1, r_2, \ldots) = S(t; u_1, u_2, \ldots)$
where $u_i = S(s_i; r_1, r_2, \ldots)$. Here two substitution walks over $t$ are merged into a single
substitution walk over $t$. For a more concrete example, consider the term $((\lambda\lambda\, t_1)\ t_2\ t_3)$.
A naive reduction of this term would require two walks over the structure of $t_1$: the first
for $t_2$ and the second for $t_3$. Moreover, the second walk would also have to walk over the
structure of $t_2$ in each place where it is substituted into $t_1$. A more reasonable approach is to
merge the two substitutions prior to making a walk over the structure of $t_1$. For instance,
in the suspension calculus we can rewrite the term to $[\![t_1, 2, 0, (t_2, 0) :: (t_3, 0) :: nil]\!]$, which
requires only one walk over $t_1$ and avoids any walks over $t_2$ since the non-overlapping nature
of the two substitutions is detected by the merging process. In practice, this property has
proven to have a great impact on efficiency [LNQ04].
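
De Bruijn's meta-property can be illustrated with a small executable sketch. The code below is an illustration under stated assumptions rather than the formulation of Definition 2.2.2: it assumes the usual capture-avoiding reading of $S$ in which a substitution is lifted when it passes under a binder, and the names `Var`, `App`, `Lam`, `rename`, `subst` and `merge` are hypothetical. A substitution $s_1, s_2, \ldots$ is modelled as a function from indices to terms.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Var:
    index: int  # de Bruijn index, counted from 1


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    body: "Term"


Term = Union[Var, App, Lam]
Subst = Callable[[int], Term]  # maps the index i to the term s_i


def rename(t: Term, f: Callable[[int], int]) -> Term:
    """Apply the index renaming f to the free indices of t."""
    if isinstance(t, Var):
        return Var(f(t.index))
    if isinstance(t, App):
        return App(rename(t.fun, f), rename(t.arg, f))
    # Under a binder, index 1 is bound and the other indices are offset by one.
    return Lam(rename(t.body, lambda i: 1 if i == 1 else f(i - 1) + 1))


def subst(t: Term, s: Subst) -> Term:
    """S(t; s_1, s_2, ...): replace every free index i of t by s(i)."""
    if isinstance(t, Var):
        return s(t.index)
    if isinstance(t, App):
        return App(subst(t.fun, s), subst(t.arg, s))
    # Lift s under the binder: index 1 stays, substituted terms are shifted.
    lifted = lambda i: Var(1) if i == 1 else rename(s(i - 1), lambda j: j + 1)
    return Lam(subst(t.body, lifted))


def merge(s: Subst, r: Subst) -> Subst:
    """The merged substitution u with u_i = S(s_i; r_1, r_2, ...)."""
    return lambda i: subst(s(i), r)
```

With these definitions, `subst(subst(t, s), r)` walks over `t` and then over the result, whereas `subst(t, merge(s, r))` makes a single walk over `t`; the meta-property says the two results coincide.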

Confluence in the presence of graftable meta variables, see Section 3.6, requires a calculus
with rules for interactions between substitutions. To see why, consider the example from
Section 3.6 in a named context where we have the term $((\lambda a.((\lambda b.X)\ t_1))\ t_2)$ with $X$ a
graftable meta variable. Depending on which redex is contracted first, this term can reduce
to either $X\langle t_1/b\rangle\langle t_2/a\rangle$ or $X\langle t_2/a\rangle\langle t_1\langle t_2/a\rangle/b\rangle$ where $\langle t/x\rangle$ is an explicit representation of
substitution. In order to reconcile these two terms, interaction rules for substitutions must
be added to the calculus. These interaction rules either take the form of combination rules,
as seen in the previous paragraph, or permutation rules.

Preservation of strong normalization (PSN) means that lambda terms which are strongly
normalizing in the lambda calculus remain strongly normalizing in the explicit substitution
calculus, i.e., an infinite reduction path is never added to a term with only finite
reduction paths. To see why PSN might fail, consider the above problem of confluence
and how we might resolve it by adding a permutation rule of the form $X\langle t_1/b\rangle\langle t_2/a\rangle \to
X\langle t_2/a\rangle\langle t_1\langle t_2/a\rangle/b\rangle$. This rule fixes the confluence problem, but now the system fails to
preserve strong normalization since the new rule can be repeatedly applied to itself to permute
the substitutions back and forth. Preservation of strong normalization is desirable
because one often works in a representational setting with typed lambda terms which are
strongly normalizing, and if PSN holds then all of those terms remain strongly normalizing
in the explicit substitution calculus. On the other hand, if PSN does not hold then one
must be careful in selecting a reduction strategy which avoids the newly introduced infinite
reduction paths. Preservation of strong normalization is studied in further depth in
Chapter 5.

Another part of our survey of explicit substitution calculi consists of translations between
the other popular calculi and the suspension calculus, towards understanding and
contrasting their relative capabilities. To give substance to our translations, we establish
relevant properties of the translations such as their correctness and their ability to preserve
important computational properties of the calculi they relate. The first half of showing
correctness is that well-formed terms are translated to well-formed terms. The second half
is that normal forms, with respect to substitutions, are preserved by the translation. To
show that important properties are preserved, we will argue that the translations are information
preserving, which is an intuitive rather than formal notion. There are various ways
in which we can capture this notion, with the most desirable being to show that if $t$ rewrites
to $r$ in one step then, given the translation $T$, $T(t)$ rewrites to $T(r)$ in at least one step.
We call this property simulation because it shows the translation preserves the information
needed to simulate the substitution process of one calculus using the substitution process of
another. This is not always possible due to the idiosyncrasies of different calculi. In these
cases we will find other ways of arguing for information preservation, while
also looking at why simulation fails, since the reason often reveals key differences between
calculi. Finally, we note that for all of our translations we assume we are in a context
without graftable meta variables.

We begin by separating the calculi based on combination of substitution walks, since
this property is evident and has the greatest effect on the syntax of the language.

4.1 Calculi Without Merging

In this section we look at three calculi without merging: the $\lambda\upsilon$-calculus [BBLRD96], the
$\lambda s$-calculus [KR95], and the $\lambda s_e$-calculus [KR97]. These calculi lack the syntax for merging
substitutions and instead each substitution in these calculi represents a substitution for at most
one de Bruijn index and then possible renumberings for other de Bruijn indices. Since the
notion of substitution in the suspension calculus is more general, we can only discuss a
translation from these calculi to the suspension calculus and not the other way around.
Nevertheless, seeing how the substitution concepts in these calculi are reflected in the suspension
calculus gives us greater insight into their key characteristics.

4.1.1 The $\lambda\upsilon$-calculus

The $\lambda\upsilon$-calculus is actually a simplification of the $\lambda\sigma$-calculus, a calculus we will see more of
in Section 4.2. The $\lambda\upsilon$-calculus was created by removing the syntax for merging of substitutions
available in the $\lambda\sigma$-calculus and then modifying the rewriting rules to accommodate
the new syntax. This simplified system was proven to preserve strong normalization, but
at the cost of confluence in the presence of graftable meta variables. Another cost of the
simplification is a peculiarity in the rewriting rules that makes the system undesirable from
an implementation perspective. We develop these issues in this section, starting first with
the syntax of the calculus.

Definition 4.1.1. The syntax of $\lambda\upsilon$-expressions is given by the following definitions of
terms, denoted $a$ and $b$, and substitutions, denoted $s$.

$a ::= n \mid a\ b \mid \lambda\, a \mid a[s]$

$s ::= a/ \mid \Uparrow(s) \mid \uparrow$

The term $n$ represents the $n$th de Bruijn index, and $a[s]$ is called a closure. The substitution
$a/$ is called slash and represents the substitution of $a$ for the first de Bruijn index and
a shifting down of all other de Bruijn indices. The substitution $\Uparrow(s)$ is called lift and is used
to push substitutions underneath lambda abstractions. The last substitution $\uparrow$ is called
shift and represents increasing all free de Bruijn indices by one. Many of these concepts are
less generalized versions of what is available in the suspension calculus, a point we make
more explicit with the following translation.

Definition 4.1.2. The translation $T$ from $\lambda\upsilon$-terms to suspension terms and the translation
$E$ from $\lambda\upsilon$-substitutions to triples of an old embedding level, a new embedding level, and a
suspension environment are defined simultaneously by recursion as follows:

1. For a term $t$, $T(t)$ is $\#n$ if $t$ is $n$, $(T(a)\ T(b))$ if $t$ is $(a\ b)$, $\lambda\, T(a)$ if $t$ is $\lambda\, a$, and
$[\![T(a), ol, nl, e]\!]$ if $t$ is $a[s]$ where $(ol, nl, e) = E(s)$.

2. For a substitution $s$, $E(s)$ is $(1, 0, (T(a), 0) :: nil)$ if $s$ is $a/$, $(0, 1, nil)$ if $s$ is $\uparrow$, and
$(ol + 1, nl + 1, (\#1, nl + 1) :: e)$ if $s$ is $\Uparrow(s')$ where $(ol, nl, e) = E(s')$.

(B)          $(\lambda\, a)\ b \to a[b/]$
(App)        $(a\ b)[s] \to a[s]\ b[s]$
(Lambda)     $(\lambda\, a)[s] \to \lambda\, (a[\Uparrow(s)])$
(FVar)       $1[a/] \to a$
(RVar)       $n{+}1[a/] \to n$
(FVarLift)   $1[\Uparrow(s)] \to 1$
(RVarLift)   $n{+}1[\Uparrow(s)] \to n[s][\uparrow]$
(VarShift)   $n[\uparrow] \to n{+}1$

Figure 4.1: Rewrite rules for the $\lambda\upsilon$-calculus

Theorem 4.1.1. For every $\lambda\upsilon$-term $a$, $T(a)$ is a well-formed suspension term.

Proof. The proof is by induction using the dual property that for every $\lambda\upsilon$-substitution
$s$ such that $(ol, nl, e) = E(s)$ we have $ol = len(e)$, $nl \geq lev(e)$, and $e$ is a well-formed
suspension environment. □
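
The mutual recursion between $T$ and $E$ can be sketched directly. The encoding below is purely illustrative: terms and substitutions are tagged tuples, environments are tuples of (term, level) pairs, and none of these representation choices come from the calculus itself. The function names `T` and `E` mirror the translation of Definition 4.1.2.

```python
def T(a):
    """Translate a lambda-upsilon term into a suspension term.

    Input terms:  ("idx", n), ("app", a, b), ("lam", a), ("clo", a, s)
    Output terms: ("idx", n), ("app", t1, t2), ("lam", t),
                  ("susp", t, ol, nl, env)
    """
    tag = a[0]
    if tag == "idx":
        return ("idx", a[1])
    if tag == "app":
        return ("app", T(a[1]), T(a[2]))
    if tag == "lam":
        return ("lam", T(a[1]))
    ol, nl, env = E(a[2])  # closure a[s]
    return ("susp", T(a[1]), ol, nl, env)


def E(s):
    """Translate a substitution into a triple (ol, nl, env).

    Input substitutions: ("slash", a), ("lift", s), ("shift",)
    """
    tag = s[0]
    if tag == "slash":
        return (1, 0, ((T(s[1]), 0),))
    if tag == "shift":
        return (0, 1, ())
    ol, nl, env = E(s[1])  # lift
    return (ol + 1, nl + 1, ((("idx", 1), nl + 1),) + env)
```

For example, translating a doubly lifted slash yields an environment of length three whose old embedding level is also three, in line with the invariant $ol = len(e)$ used in the proof of Theorem 4.1.1.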

The rules of the $\lambda\upsilon$-calculus are presented in Figure 4.1. We define the $\upsilon$ rules to be
all the rules of the $\lambda\upsilon$-calculus except (B). Because there is no possibility for merging
substitutions, the $\upsilon$ rules simply push substitutions down in the tree and then evaluate
them once they are applied to de Bruijn indices. Thus most of the $\upsilon$ rules can be matched
up with a corresponding reading rule from the suspension calculus, with the exception
being the rule RVarLift. The fundamental problem with this rule is that it replaces a single
substitution on the left with two substitutions on the right. From the suspension calculus
point of view, this is a step backwards. Thus we instead prove the following theorem in
which $a \rhd_{\upsilon} b$ implies $T(a)$ and $T(b)$ rewrite to a common term, rather than a stronger one
in which $T(a) \rhd^*_{rm} T(b)$.

Theorem 4.1.2. Let $a$ and $b$ be $\lambda\upsilon$-terms such that $a \rhd_{\upsilon} b$. Then there exists a suspension
term $t$ such that $T(a) \rhd^*_{rm} t$ and $T(b) \rhd^*_{rm} t$.

Proof. The proof is by case analysis on the rule used to transition from $a$ to $b$. In every case
but RVarLift we can actually prove that $T(a) \rhd^*_{rm} T(b)$. The most difficult of these cases is
FVar, for which we must show $[\![\#1, 1, 0, (T(a), 0) :: nil]\!] \rhd^*_{rm} T(a)$. To do this, we first apply
(r3) to generate $[\![T(a), 0, 0, nil]\!]$. Then we prove by induction the general property that
$[\![t, 0, 0, nil]\!] \rhd^*_{rm} t$ in a setting without graftable meta variables.

In the case of RVarLift, suppose that $(ol, nl, e) = E(s)$. Then we can show that the
terms $[\![\#(n + 1), ol + 1, nl + 1, (\#1, nl + 1) :: e]\!]$ and $[\![[\![\#n, ol, nl, e]\!], 0, 1, nil]\!]$ have a common
reduct in the term $[\![\#n, ol, nl + 1, e]\!]$. □

Based on this theorem, the translation $T$ preserves de Bruijn normal forms. To show
that $T$ is information preserving we offer the following theorem, which shows that $T$ is
one-to-one.

Theorem 4.1.3. The translation $T$ is one-to-one.

Proof. The proof is by induction using the dual property that $E$ is one-to-one. □

Looking again at the RVarLift rule, we can see a problem from the implementation
perspective. Consider the term $4[\Uparrow(\Uparrow(\Uparrow(a/)))]$ which rewrites to $a[\uparrow][\uparrow][\uparrow]$. Here three
separate renumbering passes are generated in order to increase all free de Bruijn indices by
three. The problem is that not only is combination of substitutions not allowed, but the
syntax of the $\lambda\upsilon$-calculus is not rich enough to encode a renumbering of de Bruijn indices
by anything but one. In the next section we will see another calculus without merging, but
with a more general notion of substitution which avoids this problem.
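
The accumulation of shift closures can be observed by running the variable rules on an index closure. The sketch below is illustrative only: it uses a hypothetical tuple encoding of substitutions, `("slash", a)`, `("lift", s)` and `("shift",)`, and a hypothetical function `lookup` that returns a term together with the number of pending $[\uparrow]$ closures wrapped around it.

```python
def lookup(n, s):
    """Rewrite the closure n[s] using only the variable rules.

    Returns (term, k), standing for term followed by k pending shift
    closures, i.e. term[shift][shift]...[shift].
    """
    tag = s[0]
    if tag == "slash":  # FVar and RVar
        return (s[1], 0) if n == 1 else (("idx", n - 1), 0)
    if tag == "shift":  # VarShift
        return (("idx", n + 1), 0)
    if n == 1:  # FVarLift
        return (("idx", 1), 0)
    # RVarLift: (n+1)[lift(s)] becomes n[s][shift], adding one pending shift.
    term, k = lookup(n - 1, s[1])
    return (term, k + 1)
```

Applied to the index 4 under three lifts of a slash, the function reports three pending shift closures around the substituted term, each of which would still require its own traversal.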

4.1.2 The $\lambda s$-calculus

The $\lambda s$-calculus is similar to the $\lambda\upsilon$-calculus in that it preserves strong normalization and
fails to have confluence in the presence of graftable meta variables. The two primary differences
are that the $\lambda s$-calculus clearly separates the processes of substitution and renumbering,
and the $\lambda s$-calculus has a more general notion of substitution. These differences are
reflected in the syntax.

Definition 4.1.3. The syntax of $\lambda s$-expressions is given by the following definition of terms,
denoted $a$ and $b$.

$a ::= n \mid a\ b \mid \lambda\, a \mid a\,\sigma^i\, b \mid \varphi^i_k\, a$

Here $n$ and $i$ range over all positive integers and $k$ over all non-negative integers.

The term $n$ represents the $n$th de Bruijn index. The term $a\,\sigma^i\, b$ is called a closure and
represents the substitution of a renumbered version of $b$ for the $i$th de Bruijn index in $a$ and
a shifting down by one of all de Bruijn indices greater than $i$ in $a$. The term $\varphi^i_k\, a$ is called
an update and represents an increase by $i - 1$ of all de Bruijn indices greater than $k$. All of
these concepts can be translated into the suspension calculus by the following translation.

$\sigma$-generation          $(\lambda\, a)\ b \to a\,\sigma^1\, b$
$\sigma$-$\lambda$-transition      $(\lambda\, a)\,\sigma^i\, b \to \lambda\, (a\,\sigma^{i+1}\, b)$
$\sigma$-app-transition      $(a_1\ a_2)\,\sigma^i\, b \to (a_1\,\sigma^i\, b)\ (a_2\,\sigma^i\, b)$
$\sigma$-destruction         $n\,\sigma^i\, b \to n - 1$ if $n > i$; $\varphi^i_0\, b$ if $n = i$; $n$ if $n < i$
$\varphi$-$\lambda$-transition      $\varphi^i_k (\lambda\, a) \to \lambda\, (\varphi^i_{k+1}\, a)$
$\varphi$-app-transition      $\varphi^i_k (a_1\ a_2) \to (\varphi^i_k\, a_1)\ (\varphi^i_k\, a_2)$
$\varphi$-destruction         $\varphi^i_k\, n \to n + i - 1$ if $n > k$; $n$ if $n \leq k$

Figure 4.2: Rewrite rules for the $\lambda s$-calculus

Definition 4.1.4. The translation $T$ from $\lambda s$-terms to suspension terms is defined by recursion
as follows: For a term $t$, $T(t)$ is $\#n$ if $t$ is $n$, $(T(a)\ T(b))$ if $t$ is $(a\ b)$, $\lambda\, T(a)$ if $t$
is $\lambda\, a$, $[\![T(a), i, i - 1, (\#1, i - 1) :: (\#1, i - 2) :: \ldots :: (\#1, 1) :: (T(b), 0) :: nil]\!]$ if $t$ is $a\,\sigma^i\, b$,
and $[\![T(a), k, k + i - 1, (\#1, k + i - 1) :: (\#1, k + i - 2) :: \ldots :: (\#1, i) :: nil]\!]$ if $t$ is $\varphi^i_k\, a$.

Theorem 4.1.4. For every $\lambda s$-term $a$, $T(a)$ is a well-formed suspension term.

Proof. The proof is by induction. □



The rules of the $\lambda s$-calculus are presented in Figure 4.2. We define the $s$ rules to be
all the rules of the $\lambda s$-calculus except $\sigma$-generation. Because of the separation between
substitution and renumbering, there is some redundancy in the rules, e.g. $\sigma$-app-transition
and $\varphi$-app-transition. But looking at $\sigma$-$\lambda$-transition and $\varphi$-$\lambda$-transition, we see the benefit
is that substitution and renumbering can have separate behaviors for descending underneath
lambda abstractions. This cleanness in the rules allows for a very well behaved translation.

Theorem 4.1.5. Let $a$ and $b$ be $\lambda s$-terms such that $a \rhd_s b$. Then $T(a) \rhd^+_{rm} T(b)$.

Proof. The proof is by case analysis on the rule used to transition from $a$ to $b$. □
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The above theorem tells us that the translation T is information and normal form 
preserving. Moreover, it shows us that the suspension calculus, even without merging, is 
capable of exactly simulating the As-calculus, and it gives us a proof that the s rules of 
the As-calculus are strongly normalizing since the reading (and merging) rules are strongly 
normalizing. In the original paper on the As-calculus a similar translation is proven from 
the As-calculus to the Ac-calculus, and it is through this translation that the strong nor- 
malization of the s rules is established. That the As-calculus translates so nicely into both 
the suspension calculus and the Acr-calculus is a strong argument that the calculus is well- 
designed and natural for representing single substitutions. 

Another point to note about the above theorem is that the suspension calculus may have
to make multiple reading steps to simulate a single step in the λs-calculus. The primary
reason for this comes from the (r3) and (r4) rules of the suspension calculus which are used
to evaluate the result of applying a suspension to a de Bruijn index. Because a suspension
can represent a substitution for various different de Bruijn indices, these rules must check
to see which substitution applies for a given index. In the λs-calculus, on the other hand,
each closure represents a single substitution, so when a de Bruijn index is encountered we
can immediately check if it is the one being substituted for. This clearly gives a benefit in
efficiency to the λs-calculus, but this benefit is not enough to offset the benefit gained by
merging substitutions [LNQ04].

4.1.3 The λsₑ-calculus

The λsₑ-calculus is an extension of the λs-calculus designed to gain confluence in a setting
with graftable meta variables. The calculus achieves this by allowing what some call merging
of substitutions, but what is more accurately described as permutation of substitutions. To
be precise, the λsₑ-calculus maintains the same syntax as the λs-calculus and extends the
rewrite rules with the six rules in Figure 4.3.

Notice that each rule has careful restrictions on it to prevent looping behavior, but as
shown in [Gui00] this is not enough: the λsₑ-calculus fails to preserve strong normalization.
Another more technical problem with the λsₑ-calculus is that normal forms in a
context of graftable meta variables can become unwieldy. A λsₑ-normal form has the same
basic structure as a de Bruijn term, except that graftable meta variables can have sequences
of closures and updates applied to them. The only restriction on these substitutions is that
none of the sₑ-rules apply to them [KR97]. This problem is brought to the forefront in the
context of higher-order unification using the λsₑ-calculus, where graftable meta variables
and normal forms play an important role in efficient unification procedures [ARK03].

\[
\begin{array}{lrcll}
\sigma\text{-}\sigma\text{-transition} & (a\,\sigma^{i}\, b)\,\sigma^{j}\, c & \to & (a\,\sigma^{j+1}\, c)\,\sigma^{i}\,(b\,\sigma^{j-i+1}\, c) & \text{if } i \le j \\
\sigma\text{-}\varphi\text{-transition 1} & (\varphi^{i}_{k}\, a)\,\sigma^{j}\, b & \to & \varphi^{i-1}_{k}\, a & \text{if } k < j < k + i \\
\sigma\text{-}\varphi\text{-transition 2} & (\varphi^{i}_{k}\, a)\,\sigma^{j}\, b & \to & \varphi^{i}_{k}(a\,\sigma^{j-i+1}\, b) & \text{if } k + i \le j \\
\varphi\text{-}\sigma\text{-transition} & \varphi^{i}_{k}(a\,\sigma^{j}\, b) & \to & (\varphi^{i}_{k+1}\, a)\,\sigma^{j}\,(\varphi^{i}_{k+1-j}\, b) & \text{if } j \le k + 1 \\
\varphi\text{-}\varphi\text{-transition 1} & \varphi^{i}_{k}(\varphi^{j}_{l}\, a) & \to & \varphi^{j}_{l}(\varphi^{i}_{k+1-j}\, a) & \text{if } l + j \le k \\
\varphi\text{-}\varphi\text{-transition 2} & \varphi^{i}_{k}(\varphi^{j}_{l}\, a) & \to & \varphi^{j+i-1}_{l}\, a & \text{if } l \le k < l + j
\end{array}
\]

Figure 4.3: Additional rewrite rules for the λsₑ-calculus

4.2 A Calculus with Merging: the λσ-calculus

The λσ-calculus supports a general notion of composition of substitution walks and there
exists a variant of it which is confluent in a setting with graftable meta variables [ACCL91,
CHL96]. Unfortunately, Melliès was able to demonstrate that the calculus lacks preservation
of strong normalization by presenting a simply typed lambda term for which an infinite
λσ-reduction path exists [Mel95].

We use the rest of this section to define the λσ-calculus and construct translations to
and from the suspension calculus.

Definition 4.2.1. The syntax of λσ-expressions is given by the following definition of
terms, denoted $a$ and $b$, and substitutions, denoted $s$ and $t$.

\[
\begin{array}{rcl}
a & ::= & 1 \mid a\, b \mid \lambda a \mid a[s] \\
s & ::= & id \mid {\uparrow} \mid a \cdot s \mid s \circ t
\end{array}
\]

The term $a[s]$ is called a closure and represents the term $a$ with some substitution $s$
to be applied to it. The substitution $id$ is the identity substitution. The substitution $\uparrow$ is
called shift and represents an increase of all free de Bruijn indices by 1. The substitution
$a \cdot s$ is called cons and represents a term $a$ to be substituted for the first de Bruijn index
along with a substitution $s$ for the remaining indices. Lastly, the substitution $s \circ t$ represents
the merging of the substitutions $s$ and $t$.

Note that the terms in this calculus only contain the first de Bruijn index. All others
are represented by $1[\uparrow]$, $1[\uparrow \circ \uparrow]$, $1[(\uparrow \circ \uparrow) \circ \uparrow]$, etc. We abbreviate the $n$-fold composition
$(\ldots(\uparrow \circ \uparrow) \circ \ldots) \circ \uparrow$ as $\uparrow^{n}$, so that the $(n+1)$-th de Bruijn index is written $1[\uparrow^{n}]$.
With this in mind, the rules for the λσ-calculus are presented in Figure 4.4.

\[
\begin{array}{llcl}
(\text{Beta}) & (\lambda a)\, b & \to & a[b \cdot id] \\
(\text{App}) & (a\, b)[s] & \to & a[s]\; b[s] \\
(\text{VarId}) & 1[id] & \to & 1 \\
(\text{Abs}) & (\lambda a)[s] & \to & \lambda (a[1 \cdot (s \circ {\uparrow})]) \\
(\text{VarCons}) & 1[a \cdot s] & \to & a \\
(\text{Clos}) & a[s][t] & \to & a[s \circ t] \\
(\text{IdL}) & id \circ s & \to & s \\
(\text{Map}) & (a \cdot s) \circ t & \to & a[t] \cdot (s \circ t) \\
(\text{ShiftId}) & {\uparrow} \circ id & \to & {\uparrow} \\
(\text{Ass}) & (s \circ t) \circ u & \to & s \circ (t \circ u) \\
(\text{ShiftCons}) & {\uparrow} \circ (a \cdot s) & \to & s
\end{array}
\]

Figure 4.4: Rewrite rules for the λσ-calculus
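As a small worked illustration of how the rules of Figure 4.4 cooperate, consider looking up the second de Bruijn index, written $1[\uparrow]$, in a substitution that supplies $a$ for the first index and $b$ for the second. The terms $a$ and $b$ are arbitrary and chosen only for this example.

```latex
\[
\begin{array}{rcl}
1[\uparrow][a \cdot (b \cdot id)]
  & \rhd_{Clos} & 1[\uparrow \circ (a \cdot (b \cdot id))] \\
  & \rhd_{ShiftCons} & 1[b \cdot id] \\
  & \rhd_{VarCons} & b
\end{array}
\]
```

The shift discards the head of the cons, so the entry $a$ never has to be examined. This pruning behavior is what makes the interaction of (ShiftCons) with (Ass) and (Map) delicate.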
4.2.1 Suspension Expressions to λσ-expressions

The translation from suspension expressions to λσ-expressions works by unfolding the
information which is represented by the indices and embedding levels of the suspension calculus
into individual substitution operations of the λσ-calculus. Accounting for this, the rest of
the translation is straightforward and translates suspension expressions into corresponding
λσ-expressions: suspension to closure, nil to id, cons to cons, and merged to merged.
Besides the difference in representing renumberings, the syntax of the two calculi matches up
nicely.

Definition 4.2.2. The translation $S$ from suspension terms to λσ-terms and the translation
$R$ from pairs of a suspension environment and a new embedding level to λσ-substitutions
are defined simultaneously by recursion as follows:

1. For a term $t$, $S(t)$ is $1$ if $t$ is $\#1$, $1[\uparrow^{n}]$ if $t$ is $\#(n+1)$ with $n \ge 1$, $(S(a)\ S(b))$ if $t$
is $(a\ b)$, $\lambda\,S(a)$ if $t$ is $\lambda a$, and $S(t')[R(e, nl)]$ if $t$ is $[\![t', ol, nl, e]\!]$.

2. For an environment $e$ and natural number $j$, $R(e, j)$ is
$(\ldots((id \circ \uparrow) \circ \uparrow) \circ \ldots) \circ \uparrow$ with $j$ occurrences of $\uparrow$ if $e$ is $nil$,
$(\ldots((S(t) \cdot R(e', n)) \circ \uparrow) \circ \ldots) \circ \uparrow$ with $j - n$ occurrences of $\uparrow$ if $e$ is $(t, n) :: e'$,
and $R(e_1, nl_1) \circ R(e_2, j - (nl_1 \mathbin{\dot{-}} ol_2))$ if $e$ is $\{\!\{e_1, nl_1, ol_2, e_2\}\!\}$.

The translation from suspension expressions to λσ-expressions includes a translation
$R(e, j)$ which translates the environment $e$ relative to the embedding level $j$. Restrictions
must be placed on this translation to ensure the definition is well-formed. For example,
looking at the second case for $R(e, j)$, one might worry that $j < n$, in which case having
$j - n$ shifts does not make sense. We can ensure this never happens by requiring that
$lev(e) \le j$ every time $R(e, j)$ is called. Enforcement of this is provided by the well-formedness
properties of suspension terms.

Theorem 4.2.1. If $t$ is a suspension term then $S(t)$ is well-defined.

Proof. The property must be proved simultaneously with the property that if $e$ is a
suspension environment and $j$ is an integer with $j \ge lev(e)$ then $R(e, j)$ is well-defined. □

Due to occurrences of the identity substitution and small differences in associativity,
the λσ-calculus does not simulate the suspension calculus. Instead, we show that normal
forms are preserved by the translation.

Theorem 4.2.2. Let $a$ and $b$ be suspension terms such that $a \rhd_{rm} b$. Then there exists a
λσ-term $t$ such that $S(a) \rhd_{\sigma}^{*} t$ and $S(b) \rhd_{\sigma}^{*} t$.

Proof. The proof uses the dual property that if $a$ and $b$ are suspension environments with
$a \rhd_{rm} b$ and $j$ is an integer such that $j \ge lev(a)$ then $R(a, j) \rhd_{\sigma}^{*} R(b, j)$. We can then prove
both properties by case analysis on the rule used to transition from $a$ to $b$. □

Finally, we argue that S is information preserving by showing that it is one-to-one. Note 
that this property is quite strong since we are translating from a calculus with merging 
to another calculus with merging. Before, when we translated from a calculus without 
merging to one with merging, this property was more obvious since only the substitution 
representations might overlap. Here we must worry about substitution representations and 
also merged substitution representations. 

Theorem 4.2.3. The translation S is one-to-one. 

Proof. The result follows easily from noticing that $R(e, j)$ can never equal $\uparrow^{k}$ for any $e$, $j$,
and $k$. □

Because the λσ-calculus and the suspension calculus seem to be equally expressive, we
can also define a translation in the other direction, which we do in the following section.

4.2.2 λσ-expressions to Suspension Expressions

The translation from λσ-expressions to suspension expressions proceeds in the obvious way
except for a special case when translating $s \circ \uparrow$ which changes the shift substitution into
the corresponding renumbering concept expressed in embedding levels and indices.

Definition 4.2.3. The translation $T$ from λσ-terms to suspension terms and the translation
$E$ from λσ-substitutions to triples of an old embedding level, a new embedding level, and a
suspension environment are defined simultaneously by recursion as follows:

1. For a term $t$, $T(t)$ is $\#1$ if $t$ is $1$, $\#(n+1)$ if $t$ is $1[\uparrow^{n}]$, $(T(a)\ T(b))$ if $t$ is $(a\ b)$,
$\lambda\,T(a)$ if $t$ is $\lambda a$, and $[\![T(a), ol, nl, e]\!]$ if $t$ is $a[s]$ where $(ol, nl, e) = E(s)$.

2. For a substitution $s$, $E(s)$ is $(0, 0, nil)$ if $s$ is $id$, $(0, 1, nil)$ if $s$ is $\uparrow$,
$(ol+1, nl, (T(a), nl) :: e)$ if $s$ is $a \cdot s'$ where $(ol, nl, e) = E(s')$, $(ol, nl+1, e)$ if $s$ is $s' \circ \uparrow$
where $(ol, nl, e) = E(s')$, and
$(ol_1 + (ol_2 \mathbin{\dot{-}} nl_1),\ nl_2 + (nl_1 \mathbin{\dot{-}} ol_2),\ \{\!\{e_1, nl_1, ol_2, e_2\}\!\})$ if $s$
is $s_1 \circ s_2$ where $(ol_1, nl_1, e_1) = E(s_1)$ and $(ol_2, nl_2, e_2) = E(s_2)$.

If more than one case applies to the same expression, we require that the first one listed is the
one used.

Theorem 4.2.4. For every λσ-term $a$, $T(a)$ is a well-formed suspension term.

Proof. The proof is by induction using the dual property that for every λσ-substitution
$s$ such that $(ol, nl, e) = E(s)$ we have $ol = len(e)$, $nl \ge lev(e)$, and $e$ is a well-formed
suspension environment. □

The suspension calculus is not capable of simulating the λσ-calculus and this is not a
bad property. If the suspension calculus were able to simulate the λσ-calculus, then the
suspension calculus would be able to simulate the Melliès counterexample to preservation
of strong normalization [Mel95]. The primary reason for the lack of simulation is the
(Ass) rule which establishes an association rule for merged substitutions. In the suspension
calculus, however, such association is a property we have proven with significant work, see
Section 3.8.1, but it is not a rule of the calculus. Instead of showing simulation, we show
that normal forms are preserved by the translation.

Theorem 4.2.5. Let $a$ and $b$ be λσ-terms such that $a \rhd_{\sigma} b$. Then there exists a suspension
term $t$ such that $T(a) \rhd_{rm}^{*} t$ and $T(b) \rhd_{rm}^{*} t$.

Proof. A naive approach to this theorem would be filled with special cases to account for
the special cases present in the translations $T$ and $E$. In order to avoid this, note that in
the special case of $T(1[\uparrow^{n}]) = \#(n+1)$, if we had used the more general translation of $a[s]$
we would have produced $[\![\#1, 0, n, nil]\!]$. Since these two terms are $\rhd_{rm}^{*}$-convertible we can
pick the second one for this theorem and ignore the special case. The same result holds for
the special case of $E(s \circ \uparrow)$.

The other difficulty in proving this theorem is that we will need a corresponding property
for λσ-substitutions. Naively, this property might be that if $s \rhd_{\sigma} t$ then the old and new
embedding level components of $E(s)$ and $E(t)$ are equal and the environment components
rewrite to a common environment. This will fail because of the (Map) rule in the λσ-calculus
which has the form $(a \cdot s_1) \circ s_2 \to a[s_2] \cdot (s_1 \circ s_2)$. Letting $t = T(a)$, $(ol_1, nl_1, e_1) = E(s_1)$, and
$(ol_2, nl_2, e_2) = E(s_2)$, the environment components of the translation $E$ applied to the left
and right sides of the (Map) rule are $\{\!\{(t, nl_1) :: e_1, nl_1, ol_2, e_2\}\!\}$ and
$([\![t, ol_2, nl_2, e_2]\!],\ nl_2 + (nl_1 \mathbin{\dot{-}} ol_2)) :: \{\!\{e_1, nl_1, ol_2, e_2\}\!\}$, respectively. Note that this is
very similar to our rule (m6) but different in that $e_2$ might not have the form $(s, l) :: e_2'$ and
also we use $nl_2$ instead of the level of $e_2$. Because of these problems, these two environments
are not rewritable to a common environment. Instead, we generalize the property to state that
the environment components should be rewritable to similar environments, see Section 3.11,
from which the result follows. □

The translation $T$ is not one-to-one because the λσ-substitutions $\uparrow$ and $id \circ \uparrow$ translate
to the same tuple. We can, however, prove that this translation is a left-inverse of the
translation $S$ from suspension terms to λσ-terms. Because the translation $S$ is information
preserving, this result is strong evidence that $T$ is also information preserving. Moreover,
this result shows that our translations are well balanced and therefore, hopefully, natural.

Theorem 4.2.6. For every suspension term $t$, $T(S(t)) = t$.

Proof. The proof is by induction using the dual property that for every suspension
environment $e$ and integer $j$ such that $j \ge lev(e)$, we have $E(R(e, j)) = (ol, j, e)$ where
$ol = len(e)$. □
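The left-inverse property can be checked on concrete terms. The sketch below is a Python transcription of Definitions 4.2.2 and 4.2.3 under a hypothetical tuple encoding. The function names `S`, `R`, `T` and `E` follow the definitions, while the encoding and the helpers `monus`, `shifts` and `shift_power` are assumptions introduced here for illustration. The sketch only handles well-formed input and is not a definitive implementation.

```python
# Hypothetical encodings (assumptions of this sketch):
# suspension terms: ("idx", n) ("app", a, b) ("lam", a) ("susp", t, ol, nl, env)
# environments:     ("nil",) ("cons", t, level, env) ("merge", e1, nl1, ol2, e2)
# λσ-terms:         ("one",) ("app", a, b) ("lam", a) ("clos", a, s)
# substitutions:    ("id",) ("shift",) ("cons", a, s) ("comp", s, t)
ONE, ID, SHIFT, NIL = ("one",), ("id",), ("shift",), ("nil",)

def monus(x, y):
    """Truncated subtraction on natural numbers."""
    return max(x - y, 0)

def shifts(s, k):
    """Compose s with k shifts on the right, nested to the left."""
    for _ in range(k):
        s = ("comp", s, SHIFT)
    return s

def shift_power(s):
    """Return n if s is the n-fold shift (left-nested), otherwise None."""
    n = 1
    while s[0] == "comp" and s[2] == SHIFT:
        s, n = s[1], n + 1
    return n if s == SHIFT else None

def S(t):
    tag = t[0]
    if tag == "idx":
        return ONE if t[1] == 1 else ("clos", ONE, shifts(SHIFT, t[1] - 2))
    if tag == "app":
        return ("app", S(t[1]), S(t[2]))
    if tag == "lam":
        return ("lam", S(t[1]))
    _, body, ol, nl, env = t
    return ("clos", S(body), R(env, nl))

def R(e, j):
    tag = e[0]
    if tag == "nil":
        return shifts(ID, j)
    if tag == "cons":
        _, t, n, rest = e
        return shifts(("cons", S(t), R(rest, n)), j - n)
    _, e1, nl1, ol2, e2 = e
    return ("comp", R(e1, nl1), R(e2, j - monus(nl1, ol2)))

def T(t):
    tag = t[0]
    if tag == "one":
        return ("idx", 1)
    if tag == "app":
        return ("app", T(t[1]), T(t[2]))
    if tag == "lam":
        return ("lam", T(t[1]))
    _, a, s = t
    n = shift_power(s)
    if a == ONE and n is not None:       # special case listed first
        return ("idx", n + 1)
    ol, nl, env = E(s)
    return ("susp", T(a), ol, nl, env)

def E(s):
    tag = s[0]
    if tag == "id":
        return (0, 0, NIL)
    if tag == "shift":
        return (0, 1, NIL)
    if tag == "cons":
        ol, nl, env = E(s[2])
        return (ol + 1, nl, ("cons", T(s[1]), nl, env))
    if s[2] == SHIFT:                    # special case s' composed with shift
        ol, nl, env = E(s[1])
        return (ol, nl + 1, env)
    ol1, nl1, e1 = E(s[1])
    ol2, nl2, e2 = E(s[2])
    return (ol1 + monus(ol2, nl1), nl2 + monus(nl1, ol2),
            ("merge", e1, nl1, ol2, e2))

# A well-formed suspension with a simple environment of length 2.
example = ("susp", ("app", ("idx", 1), ("idx", 3)), 2, 3,
           ("cons", ("lam", ("idx", 1)), 3, ("cons", ("idx", 2), 1, NIL)))
round_trip = T(S(example))
```

On this input `round_trip` equals `example`, as Theorem 4.2.6 predicts. The sketch also exhibits why $T$ is not one-to-one: `E(SHIFT)` and `E(("comp", ID, SHIFT))` return the same triple.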



Chapter 5 



Preservation of Strong Normalization 

Normal forms hold a special place in the lambda calculus. The normal form of a term has the
same meaning as the original term without any β-contraction work left. This makes normal
forms an ideal basis for unification procedures over the lambda calculus. By focusing on
normal forms, such procedures can ignore the β-contraction aspect of the lambda calculus,
and instead focus just on the binding structure of normal forms.

Because of this lofty position, much work has been put into determining when normal
forms exist and how to compute them. For instance, a strong motivation behind the simply
typed lambda calculus in Section 2.3.2 is that normal forms are guaranteed to exist for all
terms in the calculus. Such terms are called normalizable. In fact, any reduction of a term
from that calculus is finite and must reach the normal form, prompting the title strongly
normalizable. This is not always the case for other variants of the lambda calculus. In those
instances, a reduction strategy must be carefully chosen which reduces a term to its normal
form, provided one exists. A well known strategy which has this behavior is called normal
order reduction and consists of always contracting the leftmost outermost β-redex, which
we will call the leading redex. The fundamental reason why this strategy works is that
contracting anything other than the leading redex will not affect the existence of the leading
redex. Thus the leading redex will persist forever unless we contract it, and therefore we
choose to contract it first. In this chapter we look at generalizations of these notions of
normalizability, strong normalizability, and reduction strategies to the context of explicit
substitution calculi.

5.1 Preservation of Normalizability 

Explicit substitution calculi are elaborations of the lambda calculus and as such they
preserve normalizability. That is, given a normalizable term in the lambda calculus, it remains
normalizable in an explicit substitution calculus. The reason is that each β-contraction
step in the lambda calculus can be matched in an explicit substitution calculus by a
simulated β-contraction step followed by a series of substitution steps. This approach to
computing normal forms is logically sound, but it removes the practical benefits of using
an explicit substitution calculus. We can make two improvements on it.

The first improvement is actually one which can be realized in the lambda calculus.
Because we are often interested in finding normal forms for the purpose of unification,
we can develop a weaker notion of normal form which does not require as much work
to compute, but still suffices for the purpose of unification. Such a notion is captured by
reducing terms to the form $(\lambda \ldots \lambda\, (h\ t_1 \ldots t_n))$ where $h$ is either a constant, a de Bruijn
index, or a meta variable. This is called a head normal form. Another term in head normal
form then unifies with this one if and only if it has the form $(\lambda \ldots \lambda\, (h\ s_1 \ldots s_n))$ where
the number of leading lambdas is the same and $t_i$ unifies with $s_i$ for $1 \le i \le n$. This
method saves us from having to normalize the terms $t_1, \ldots, t_n, s_1, \ldots, s_n$ in the case that
unification fails. A simple and complete procedure for computing head normal forms is to
perform normal order reduction until a head normal form is reached and then stop there.
This is called head normalization and in this case the leading redex is referred to as the
head redex.
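Head normalization is short enough to state as code. The following Python sketch works on plain de Bruijn terms with 1-based indices, using a hypothetical tuple encoding. The names `shift`, `subst`, `head_step` and `head_normalize`, as well as the step budget `fuel`, are assumptions of the sketch. It contracts the head redex repeatedly and never reduces inside the arguments of a head normal form.

```python
# Hypothetical tuple encoding (an assumption of this sketch):
#   ("idx", n) | ("const", name) | ("lam", body) | ("app", fun, arg)

def shift(t, d, cutoff=0):
    """Raise by d every index of t that is greater than cutoff."""
    tag = t[0]
    if tag == "idx":
        return ("idx", t[1] + d) if t[1] > cutoff else t
    if tag == "lam":
        return ("lam", shift(t[1], d, cutoff + 1))
    if tag == "app":
        return ("app", shift(t[1], d, cutoff), shift(t[2], d, cutoff))
    return t

def subst(t, j, s):
    """Replace index j in t by s and lower the indices above j."""
    tag = t[0]
    if tag == "idx":
        n = t[1]
        if n == j:
            return shift(s, j - 1)
        return ("idx", n - 1) if n > j else t
    if tag == "lam":
        return ("lam", subst(t[1], j + 1, s))
    if tag == "app":
        return ("app", subst(t[1], j, s), subst(t[2], j, s))
    return t

def head_step(t):
    """Contract the head redex of t, or return None if t is in head normal form."""
    if t[0] == "lam":
        body = head_step(t[1])
        return None if body is None else ("lam", body)
    if t[0] == "app":
        fun, arg = t[1], t[2]
        if fun[0] == "lam":
            return subst(fun[1], 1, arg)
        if fun[0] == "app":
            reduced = head_step(fun)
            return None if reduced is None else ("app", reduced, arg)
    return None

def head_normalize(t, fuel=1000):
    """Contract head redexes until none remains or the step budget runs out."""
    for _ in range(fuel):
        reduced = head_step(t)
        if reduced is None:
            return t
        t = reduced
    raise RuntimeError("no head normal form found within the step budget")

K = ("lam", ("lam", ("idx", 2)))
DELTA = ("lam", ("app", ("idx", 1), ("idx", 1)))
OMEGA = ("app", DELTA, DELTA)
result = head_normalize(("app", ("app", K, ("idx", 1)), OMEGA))
```

The term `OMEGA` has no normal form, yet `result` is the index 1 because the head redex discards it. A term such as `("app", ("idx", 3), OMEGA)` is returned unchanged, which is the saving described above when unification fails on the heads.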

The second improvement we can make is to generalize the notion of head normalization
to the explicit substitution context. Nadathur has made such a generalization in the case of
the suspension calculus and has proven that such a procedure always finds the normal form
of a normalizable term [Nad99]. The idea behind Nadathur's notion of head reduction is to
define a head redex as either the head β-redex or a $\rhd_{rm}$-redex which occurs above the head
β-redex. Generalized head reduction then consists of contracting head redexes until no more
head reductions are possible. This generalization provides more freedom in computing head
normal forms, but it retains the essential property of head reduction: that a head normal
form is always reached in a finite number of steps, assuming one exists. To see why this is
true, notice that the reading and merging rules are terminating, so any infinite reduction
would have to consist of infinitely many $(\beta_s)$ applications on head redexes. But we can map
each of these applications to the contraction of the corresponding β-redex in the lambda
calculus. This is as simple as taking the $\rhd_{rm}$-normal form before and after applying $(\beta_s)$.
Thus any infinite generalized head reduction sequence in the suspension calculus can be
mapped onto an infinite head reduction sequence in the lambda calculus.

5.2 Preservation of Strong Normalization 

A more complicated issue is whether a strongly normalizing term in the lambda calculus
remains strongly normalizing within an explicit substitution calculus. A calculus is said to
have preservation of strong normalization (PSN) if this is the case for every strongly
normalizing term. This is a desirable property because it speaks to the coherence of the calculus.
Intuitively one might expect this property to always hold since explicit substitution
calculi are elaborations of the lambda calculus, but this is not the case. The interaction of
contraction and substitution rules in explicit substitution calculi creates the possibility of
reduction sequences which do not correspond to reductions in the lambda calculus. The
existence of such reduction sequences speaks poorly for the structure of an explicit
substitution calculus. In the remainder of this chapter, we focus on the issue of preservation of
strong normalization in various explicit substitution calculi.

5.3 PSN in Calculi without Substitution Interaction 

In calculi without rules for interactions between substitutions, preservation of strong
normalization is usually true. In such calculi, substitutions are generated by a step which
simulates β-contraction and then those substitutions are pushed down through the term
until they can be evaluated. Because of this essentially linear path, it is very easy to take
an arbitrary substitution and determine which β-contraction generated it. By connecting
these β-contractions in the explicit substitution calculus to β-contractions in the lambda
calculus, any infinite reduction sequence in the former can be mapped onto one in the latter.

To take an example, we consider here the proof of preservation of strong normalization
for the λs-calculus (see Section 4.1.2). In this setting a closure refers to a term of the
form $a\,\sigma^{i}\,b$ and the inside of a closure refers to the term $b$. Suppose we have an infinite
reduction of some term in this calculus. We know that the s rules of the λs-calculus are
strongly normalizing, so the infinite reduction must contain infinitely many contractions of
β-redexes using the σ-generation rule. At each step of this infinite reduction, we can look
at the s-normal form of the current term to see what progress is being made with respect to
the lambda calculus. Clearly each step which uses an s rule does not change the s-normal
form. For the σ-generation steps, some may change the s-normal form and some may leave
it the same. Those that change the s-normal form correspond to β-contractions in the
lambda calculus. If there are an infinite number of such steps then we can use the s-normal
forms as an infinite reduction sequence in the lambda calculus. The other possibility is that
only finitely many σ-generation steps correspond to changes in the s-normal form. Now
any σ-generation step which occurs at the top level (outside of any closures) will be one
of these steps which changes the s-normal form, and therefore only finitely many of our
σ-generation steps occur at the top level. Because there are only finitely many such steps,
we can find a point in our infinite reduction at which all σ-generation steps occur inside
of closures. By the infinite pigeonhole principle, there must be one closure which contains
an infinite reduction inside it. Thus we have reduced our infinite reduction sequence to an
infinite reduction which occurs entirely within a single closure.

The next key step in the proof is to trace each closure back to the β-redex from which
it was created. This is possible since only the σ-generation rule can create closures. Using
this idea we can take our infinite reduction which occurs within some closure, say $a\,\sigma^{i}\,b$, and
know that it came from a term of the form $((\lambda a')\ b')$ where $b'$ rewrites to $b$. Now instead of
contracting the β-redex we can follow the infinite reduction path which exists for $b'$. Because
this $b'$ is no longer inside of a closure, the σ-generation steps inside it will correspond to
reductions in the lambda calculus for the s-normal form. By the same reasoning we have
followed so far, what must occur is that this $b'$ eventually generates a closure which contains
an infinite reduction. But this closure can again be unwound and mapped into a reduction
sequence in the lambda calculus. By repeating this process indefinitely we generate an
infinite reduction sequence in the lambda calculus.

5.4 Problems for Calculi with Substitution Interaction 

Some calculi have rules for interactions between substitutions, the nature of which depends
on the motivation for including them in the calculus. One motivation for substitution
interaction is to regain confluence in a setting with graftable meta variables. For instance,
in the λs-calculus, consider the term $((\lambda X)\ b)\,\sigma^{i}\, c$ where $X$ is a graftable meta variable
and $b$ and $c$ are arbitrary terms. On the one hand we can contract the redex to obtain
$(X\,\sigma^{1}\, b)\,\sigma^{i}\, c$, while on the other we can first distribute the substitution and then perform
the reduction to generate $(X\,\sigma^{i+1}\, c)\,\sigma^{1}\,(b\,\sigma^{i}\, c)$. These two terms cannot be reduced to a
common term because $X$ is a graftable meta variable. In order to fix this, the λsₑ-calculus,
see Section 4.1.3, extends the λs-calculus and introduces rules for interactions between
substitutions [KR97]. One of these rules deals exactly with the case we have,

\[
\sigma\text{-}\sigma\text{-transition} \qquad (a\,\sigma^{i}\, b)\,\sigma^{j}\, c \;\to\; (a\,\sigma^{j+1}\, c)\,\sigma^{i}\,(b\,\sigma^{j-i+1}\, c) \qquad \text{if } i \le j
\]

Applying this rule reconciles the two reductions into the term $(X\,\sigma^{i+1}\, c)\,\sigma^{1}\,(b\,\sigma^{i}\, c)$.
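The two reduction paths and the closing step can be laid out side by side. The derivation below is a worked restatement of the example, with the σ-σ-transition instantiated at its own $i := 1$ and $j := i$. The rule names are those of the λs- and λsₑ-calculi, and nothing beyond the example itself is assumed.

```latex
\[
\begin{array}{rcll}
((\lambda X)\ b)\,\sigma^{i}\, c
  & \to & (X\,\sigma^{1}\, b)\,\sigma^{i}\, c
  & (\sigma\text{-generation}) \\
  & \to & (X\,\sigma^{i+1}\, c)\,\sigma^{1}\,(b\,\sigma^{i-1+1}\, c)
  & (\sigma\text{-}\sigma\text{-transition, since } 1 \le i) \\[6pt]
((\lambda X)\ b)\,\sigma^{i}\, c
  & \to & ((\lambda X)\,\sigma^{i}\, c)\ (b\,\sigma^{i}\, c)
  & (\sigma\text{-app-transition}) \\
  & \to & (\lambda (X\,\sigma^{i+1}\, c))\ (b\,\sigma^{i}\, c)
  & (\sigma\text{-}\lambda\text{-transition}) \\
  & \to & (X\,\sigma^{i+1}\, c)\,\sigma^{1}\,(b\,\sigma^{i}\, c)
  & (\sigma\text{-generation})
\end{array}
\]
```

Since $i - 1 + 1 = i$, both paths end in the same term. Without the σ-σ-transition the first path is stuck after one step, because no s rule can push a substitution through the meta variable $X$.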

The danger in admitting a permutation rule is that once we have permuted two
substitutions, we might try to permute them again. The λsₑ-calculus tries to avoid this situation
by placing side conditions on the permutation rules so that one substitution can only be
permuted inside another if the outside one represents a contraction from higher up in the
term than the inner substitution. For example, in a term of the form $((\lambda \ldots (\lambda a)\ b \ldots)\ c)$
we allow the contraction of the outer redex to be permuted inside the contraction of the
inner redex. An additional wrinkle, however, is that we must also add rules for interactions
with updating functions. It is exactly these additional interaction rules which cause the
λsₑ-calculus to lose preservation of strong normalization, as proved by Guillaume [Gui00].

Guillaume and David solved this problem by introducing the $\lambda_{ws}$-calculus which replaces
update functions with labels representing the renumbering to be done [DG01a]. These
labels are then part of the normal forms of terms in the calculus since there are no rules for
propagating their effects. Thus this calculus corresponds to a version of the λsₑ-calculus
where restrictions are placed on the ability to propagate updating functions. These
restrictions are not so severe that confluence in the presence of graftable meta variables is lost.
Furthermore, with these restrictions, Guillaume and David are able to show that if a term
has an infinite reduction sequence then contracting and propagating a leading redex
preserves that infinite reduction sequence. Using this they map every infinite reduction in the
$\lambda_{ws}$-calculus into an infinite normal order reduction sequence in the lambda calculus, thus
proving preservation of strong normalization. The key to using the leading redex is that it
is above every other substitution which may be encountered during propagation. Thus it
can be permuted inside of those substitutions without disturbing their effect on the infinite
reduction.

A different approach to substitution interaction is to allow full combination of
substitutions such as in the suspension calculus and the λσ-calculus [ACCL91]. The benefit of
this approach is that the resulting calculus is often very efficient for direct implementation.
Additionally, the merging rules for these calculi are usually strong enough that confluence in
the presence of graftable meta variables can be recovered. The downside of allowing merging
is that it creates new interaction possibilities for substitutions, and this may lead to the loss
of preservation of strong normalization. Such is the case in the λσ-calculus where Melliès
demonstrated a strongly normalizing term along with an infinite λσ-reduction [Mel95].

The essential problem with merging in the λσ-calculus is that superfluous terms can
be generated and then allowed to interact with other substitutions. For instance, in a
substitution of the form $\uparrow \circ (a \cdot s)$ we know that the term $a$ is going to be eventually
pruned by the shift. However, if we have an outer substitution applied to this substitution,
$(\uparrow \circ (a \cdot s)) \circ r$, then we can apply the association rule for merged environments to rewrite
this term to $\uparrow \circ ((a \cdot s) \circ r)$. From here we can map the substitution $r$ onto $a$ to yield $a[r]$
which is again superfluous, but may contain significant reduction work.
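The two ways of treating the superfluous term can be compared directly. The derivation below uses only the rules (ShiftCons), (Ass) and (Map) of the λσ-calculus, with $a$, $s$ and $r$ arbitrary as in the discussion above.

```latex
\[
\begin{array}{rcl}
(\uparrow \circ (a \cdot s)) \circ r
  & \rhd_{ShiftCons} & s \circ r \\[6pt]
(\uparrow \circ (a \cdot s)) \circ r
  & \rhd_{Ass} & \uparrow \circ ((a \cdot s) \circ r) \\
  & \rhd_{Map} & \uparrow \circ (a[r] \cdot (s \circ r)) \\
  & \rhd_{ShiftCons} & s \circ r
\end{array}
\]
```

Both paths reach $s \circ r$, but the second passes through a term containing $a[r]$. Nothing forces the final (ShiftCons) step to be taken, so a reduction may work inside $a[r]$ for as long as it likes even though that closure is destined to be discarded.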
This idea is played out completely in the Melliès counterexample which begins with
a term of the form $((\lambda a)\ b)[s]$ and rewrites it so that the substitution $s$ is able to interact
with a version of itself.

\[
\begin{array}{rcl}
((\lambda a)\ b)[s]
  & \rhd_{App} & (\lambda a)[s]\ b[s] \\
  & \rhd_{Abs} & (\lambda\, a[1 \cdot (s \circ \uparrow)])\ b[s] \\
  & \rhd_{Beta} & a[1 \cdot (s \circ \uparrow)][b[s] \cdot id] \\
  & \rhd_{Clos} & a[(1 \cdot (s \circ \uparrow)) \circ (b[s] \cdot id)] \\
  & \rhd_{Map} & a[1[b[s] \cdot id] \cdot ((s \circ \uparrow) \circ (b[s] \cdot id))] \\
  & \rhd_{Ass} & a[1[b[s] \cdot id] \cdot (s \circ (\uparrow \circ (b[s] \cdot id)))]
\end{array}
\]

From here on we can focus solely on the substitution $s \circ (\uparrow \circ (b[s] \cdot id))$. Notice that at this
point the $b[s]$ component here is vacuous. Because of the $\uparrow$, the $b[s]$ should be removed as
soon as we apply (ShiftCons). Unfortunately, the rules of the λσ-calculus allow us to play
with this vacuous term and produce an infinite sequence. If we consider that $s$ might be
of the form $((\lambda a)\ b) \cdot id$ and if we abbreviate $(\uparrow \circ (b[s] \cdot id))$ as $s'$ then we can rewrite the
term $s \circ (\uparrow \circ (b[s] \cdot id))$ as follows.

\[
\begin{array}{rcl}
s \circ (\uparrow \circ (b[s] \cdot id))
  & = & (((\lambda a)\ b) \cdot id) \circ s' \\
  & \rhd_{Map} & ((\lambda a)\ b)[s'] \cdot (id \circ s') \\
  & \rhd_{IdL} & ((\lambda a)\ b)[s'] \cdot s' \\
  & \rhd_{App} & ((\lambda a)[s']\ b[s']) \cdot s' \\
  & \rhd_{Abs} & ((\lambda\, a[1 \cdot (s' \circ \uparrow)])\ b[s']) \cdot s' \\
  & \rhd_{Beta} & (a[1 \cdot (s' \circ \uparrow)][b[s'] \cdot id]) \cdot s' \\
  & \rhd_{Clos} & (a[(1 \cdot (s' \circ \uparrow)) \circ (b[s'] \cdot id)]) \cdot s' \\
  & \rhd_{Map} & (a[1[b[s'] \cdot id] \cdot ((s' \circ \uparrow) \circ (b[s'] \cdot id))]) \cdot s' \\
  & \rhd_{Ass} & (a[1[b[s'] \cdot id] \cdot (s' \circ (\uparrow \circ (b[s'] \cdot id)))]) \cdot s'
\end{array}
\]

Here we again have a subterm of the form $s' \circ (\uparrow \circ (b[s'] \cdot id))$. Using this we can repeat the
above reasoning to produce an infinite sequence.
5.5 Status of PSN for the Suspension Calculus

Preservation of strong normalization for the suspension calculus is an open problem. In this
section we explain why the counterexample from the λσ-calculus and the proof techniques
of the λs-calculus are insufficient in resolving this problem.

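To start, consider how the counterexample from the λσ-calculus would proceed in
the suspension calculus. The term $((\lambda a)\ b)[s]$ in the λσ-calculus corresponds to a term
$[\![(\lambda a)\ b, ol, nl, e]\!]$. Then the reduction can proceed as follows.

\[
\begin{array}{rcl}
[\![(\lambda a)\ b, ol, nl, e]\!]
  & \rhd_{(r5)} & [\![\lambda a, ol, nl, e]\!]\ [\![b, ol, nl, e]\!] \\
  & \rhd_{(r6)} & (\lambda\, [\![a, ol+1, nl+1, (\#1, nl+1) :: e]\!])\ [\![b, ol, nl, e]\!] \\
  & \rhd_{(\beta_s)} & [\![[\![a, ol+1, nl+1, (\#1, nl+1) :: e]\!], 1, 0, ([\![b, ol, nl, e]\!], 0) :: nil]\!] \\
  & \rhd_{(m1)} & [\![a, ol+1, nl, \{\!\{(\#1, nl+1) :: e, nl+1, 1, ([\![b, ol, nl, e]\!], 0) :: nil\}\!\}]\!] \\
  & \rhd_{(m6)} & [\![a, ol+1, nl, ([\![\#1, 1, 0, ([\![b, ol, nl, e]\!], 0) :: nil]\!], nl) :: \{\!\{e, nl+1, 1, ([\![b, ol, nl, e]\!], 0) :: nil\}\!\}]\!] \\
  & \rhd_{(m5)} & [\![a, ol+1, nl, ([\![\#1, 1, 0, ([\![b, ol, nl, e]\!], 0) :: nil]\!], nl) :: \{\!\{e, nl, 0, nil\}\!\}]\!] \\
  & \rhd_{(m2)} & [\![a, ol+1, nl, ([\![\#1, 1, 0, ([\![b, ol, nl, e]\!], 0) :: nil]\!], nl) :: e]\!]
\end{array}
\]

At this point we can focus on the first term in the environment and reduce it as follows.

\[
\begin{array}{rcl}
[\![\#1, 1, 0, ([\![b, ol, nl, e]\!], 0) :: nil]\!]
  & \rhd_{(r3)} & [\![[\![b, ol, nl, e]\!], 0, 0, nil]\!] \\
  & \rhd_{(m1)} & [\![b, ol, nl, \{\!\{e, nl, 0, nil\}\!\}]\!] \\
  & \rhd_{(m2)} & [\![b, ol, nl, e]\!]
\end{array}
\]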

This leaves the original term as la,ol + l,nl, {lb,ol,nl, e},nl) :: e]. There doesn't appear 
to be any means for an infinite reduction from this since we don't have the environment 
e acting on itself, as was the case in the Au-calculus. Instead let us reconsider the steps 
we took in producing this term and ask if we could have chosen a different reduction path 
once we merged the two environments. The answer is that there can be no other reduction 
path for this merged environment in this case or for any merged environment in the general 
case. Looking at the rules which operate on merged environments, we see that there are 
no choices in which rule can be applied at a given stage except for the trivial overlap 
between (m2) and (m3). Thus the merging process is deterministic. Furthermore, given an 
environment of the form $\{\!\{ e_1, nl_1, ol_2, e_2 \}\!\}$, there are no rules which allow the outside context
of this environment to have an effect on $e_1$ or $e_2$ until after the merging is performed. In
this way, the merging process of the suspension calculus can be viewed as an atomic action.
We can conceivably imagine replacing (m1)-(m6) with a single merging rule which rewrites
a term of the form $[\![\, [\![t, ol_1, nl_1, e_1]\!], ol_2, nl_2, e_2 \,]\!]$ to one of the form $[\![t, ol', nl', e']\!]$ where $e'$ is a
simple environment. The benefit of the separate rules (m1)-(m6) is that this large merging
operation is done in a lazy fashion.
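
The single merging rule imagined above can be sketched as an eager function. The Python below is only an illustration: the names `Susp`, `merge`, and `m1_eager` are hypothetical, environments are modelled as tuples of (term, index) pairs, and the side conditions of (m1)-(m6) are assumed to have their usual form for this calculus, including the well-formedness requirement that an environment paired with `ol` has length `ol`. Each call to `merge` has exactly one applicable branch, apart from the (m2)/(m3) overlap where both give the same result, which mirrors the determinism argument.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class Var:
    i: int  # de Bruijn index #i, counted from 1


@dataclass(frozen=True)
class Susp:
    """A suspension [[t, ol, nl, e]] whose environment e is simple."""
    t: "Term"
    ol: int
    nl: int
    e: "Env"


Term = Union[Const, Var, Susp]
Env = Tuple[Tuple[Term, int], ...]  # (t1, n1) :: (t2, n2) :: ... :: nil


def monus(a: int, b: int) -> int:
    """Subtraction truncated at zero."""
    return max(a - b, 0)


def merge(e1: Env, nl1: int, ol2: int, e2: Env) -> Env:
    """Normalize the merged environment {{e1, nl1, ol2, e2}} eagerly."""
    if len(e2) != ol2:  # assumed well-formedness condition
        raise ValueError("ill-formed merged environment")
    if ol2 == 0:                                   # (m2)
        return e1
    if nl1 == 0 and not e1:                        # (m3)
        return e2
    if not e1:                                     # (m4)
        return merge(e1, nl1 - 1, ol2 - 1, e2[1:])
    (t, n), (_, l) = e1[0], e2[0]
    if nl1 > n:                                    # (m5)
        return merge(e1, nl1 - 1, ol2 - 1, e2[1:])
    if nl1 < n:
        raise ValueError("ill-formed merged environment")
    head = (Susp(t, ol2, l, e2), l + monus(n, ol2))  # (m6)
    return (head,) + merge(e1[1:], nl1, ol2, e2)


def m1_eager(outer: Susp) -> Susp:
    """Apply (m1) and then finish the whole merge in one step."""
    inner = outer.t
    if not isinstance(inner, Susp):
        raise TypeError("expected a suspension nested in a suspension")
    ol = inner.ol + monus(outer.ol, inner.nl)
    nl = outer.nl + monus(inner.nl, outer.ol)
    return Susp(inner.t, ol, nl, merge(inner.e, inner.nl, outer.ol, outer.e))


# The example from the text, instantiated with ol = 1, nl = 2, e = (c, 1) :: nil.
e = ((Const("c"), 1),)
b_susp = Susp(Const("b"), 1, 2, e)
outer = Susp(Susp(Const("a"), 2, 3, ((Var(1), 3),) + e), 1, 0, ((b_susp, 0),))
merged = m1_eager(outer)
```

With these concrete values, `merged` corresponds to the last term of the first reduction sequence: its environment consists of the suspended `#1` paired with index `nl`, followed by the entries of `e`. The eager version gives up the laziness that the separate rules provide.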

On the other hand, we can think of extending the proof of preservation of strong
normalization for the λs-calculus to apply to the suspension calculus. In the λs-calculus we are able
to trace each closure back to the β-redex which created it, because only the β-contraction
rule can generate closures. In the suspension calculus, environment terms can be created
either by β-contraction or by merging of substitutions. Thus tracing an environment term
back to a single β-redex in the lambda calculus is extremely difficult. Furthermore, the
λs-calculus has a fairly linear order in generating and propagating substitutions, while the
suspension calculus has the possibility that propagating one substitution might mean
merging with other substitutions along the way. This allows the suspension calculus much more
freedom in choosing reduction paths, but it also makes mapping those reduction paths onto
the lambda calculus significantly more difficult.



Chapter 6 



Conclusion 

In this thesis we have presented a version of the suspension calculus which combines the de- 
sirable theoretical properties of the original suspension calculus with the practical benefits 
of the derived suspension calculi. This new version is created not by adding more to the 
calculus, but by simplifying what is already there. This simplification has the additional 
benefit of rationalizing the structure of the calculus, making it possible to easily superim- 
pose additional logical structure over it. We have illustrated this capability by showing how 
typing in the lambda calculus can be treated in the resulting framework and by presenting
a natural translation into the λσ-calculus. We have also shown how the substitution
mechanism supports combination of substitution walks while remaining confluent and ter- 
minating. Building on this, we have proven that the full suspension calculus is confluent 
even in the presence of graftable meta variables. The question of preservation of strong 
normalization relative to the suspension calculus remains open. However, we conjecture 
that it is true and we have presented arguments as to why this belief might be correct. If 
this property is indeed true then it would make the suspension calculus the only explicit 
substitution calculus which possesses the three properties deemed to be most desirable. 

Another contribution of this thesis is a survey of the realm of explicit substitution calculi. 
We have utilized the suspension calculus in this process. In particular, we have described 
translations between other popular calculi and the suspension calculus towards understand- 
ing and contrasting their relative capabilities. To give substance to this approach, we have 
established relevant properties of the translations such as their correctness and their ability 
to preserve important computational properties of the calculi they relate. 

This thesis would be incomplete without a discussion of the possible ways of building 
on the results it presents. 

Preservation of Strong Normalization 

Preservation of strong normalization is a problem of significant theoretical interest because
it speaks to the coherence of the calculus. As already mentioned, we believe that it can 
be proven true in the case of the suspension calculus. The basis of this belief is that the 
merging (or permutation) of substitutions which has caused the property to fail in other
calculi is handled correctly in the suspension calculus. Whereas the λσ-calculus, the only
other calculus that allows combination of substitutions, allows us to make choices in how to 
unravel substitution combination, the suspension calculus treats substitution combination 
as a deterministic and pseudo-atomic function. While this intuition appears to be accurate, 
working it out into a proof has been difficult. In particular, reflecting an arbitrary reduction
sequence in the suspension calculus into one in the lambda calculus appears complicated,
but this seems to be necessary to show that infinite sequences in the first context must be
matched by infinite ones in the second. Nevertheless, we believe that the simpler set of
combination rules gives us a better handle on this matter and hence are hopeful of using it 
to construct an actual argument. 

Higher-Order Unification using the Suspension Calculus 

As mentioned in the introduction, one benefit of using explicit substitutions is that it allows 
substitution notions to be actively used by processes that operate on the lambda calculus 
such as unification. Recent work has exploited this feature in producing a higher-order
unification procedure based on a variant of the λσ-calculus which supports graftable meta
variables [DHK95]. The original suspension calculus also supports graftable meta variables
and so this unification idea could have been worked out in its context as well. However, 
the incentive for doing this has been small because the complexity of its combination rules 
limits the benefit of doing this in actual implementations. Derived calculi based on the 
suspension calculus simplify these combination rules into a couple of rules which are useful
for head normalization, but these calculi are not confluent when graftable meta variables 
are added. By contrast, the suspension calculus presented in this thesis has the property 
of confluence even in the presence of graftable meta variables and also has a collection of 
combination rules that is simple enough to use directly in an implementation. The benefit 
of developing the new approach to unification based on the suspension calculus is that it 
treats renumbering in a more efficient manner than the λσ-calculus and so a higher-order
unification procedure based on the suspension calculus is likely to have better behavior in 
practice. 

Compilation of Strong Reduction 

Functional programming languages use a notion of reduction where an expression that has a 
top level abstraction is treated as a value. This form of reduction, where it is unnecessary to 
look underneath abstractions, is called weak reduction. Weak reduction is easily performed 
in an interpretive setting by keeping an environment which tracks variable bindings. It
is also possible to compile weak reduction, and the Categorical Abstract Machine, which
underlies the Objective Caml programming language, provides a framework for doing exactly
this [CCM87]. In the representational use of the lambda calculus it may be necessary to
compare terms underneath lambda abstractions, leading to the need to perform reductions even in
such contexts. This is called strong reduction. Explicit substitution calculi provide a basis
for realizing strong reduction and in fact an interpreted approach has been developed based 
on the suspension calculus and used in a λProlog implementation [LNQ04]. An approach
to using a compilation-based realization of strong reduction has also been described in the
context of the Coq system [GL02]. However, this approach is somewhat ad hoc and is based
on repeated calls to the reduction machinery underlying the categorical abstract machine. 
We believe a uniform compilation model can be developed using an explicit substitution 
notation such as the suspension calculus. 
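
To make the contrast with strong reduction concrete, the environment-based interpretation of weak reduction mentioned above can be sketched as follows. This is a minimal call-by-value evaluator for closed de Bruijn terms, written for illustration only; the names `Closure` and `weak_eval` are hypothetical, and the code is neither the Categorical Abstract Machine nor the approach of [LNQ04] or [GL02]. An abstraction is returned as a closure without its body being inspected, which is exactly what makes the reduction weak.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    i: int  # de Bruijn index #i, counted from 1


@dataclass(frozen=True)
class Lam:
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


@dataclass(frozen=True)
class Closure:
    """An abstraction body paired with the environment it was created in."""
    body: Term
    env: Tuple["Closure", ...]


def weak_eval(t: Term, env: Tuple[Closure, ...] = ()) -> Closure:
    """Weak call-by-value evaluation of a closed term."""
    if isinstance(t, Var):
        return env[t.i - 1]        # look the binding up in the environment
    if isinstance(t, Lam):
        return Closure(t.body, env)  # a value: the body is left untouched
    fn = weak_eval(t.fn, env)
    arg = weak_eval(t.arg, env)
    return weak_eval(fn.body, (arg,) + fn.env)


identity = Lam(Var(1))
const_fn = Lam(Lam(Var(2)))
value = weak_eval(App(const_fn, identity))
```

Here `value` is a closure whose body is still the unreduced `#2`, with the binding for the identity held in its environment. A strong reduction procedure would have to continue underneath that abstraction, which is where an explicit substitution notation becomes useful.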